A low-dimensional voice latent space derived from deep learning captures speaker-identity representations in the temporal voice areas and supports reconstruction of voices preserving identity ...
Our solution is largely based around 3D CNNs which we felt would generalise well to 'different' methods of deepfakes. Instead of focusing on specific pixel patterns in frames, we hope they might ...