We give two different and simple constructions for dimensionality reduction in ℓ₂ via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1 + ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas (2003) and Dasgupta et al. (2010). Such distributions can be used to speed up applications where ℓ₂ dimensionality reduction is used.
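As a concrete illustration of a sparse embedding of this flavor, the sketch below samples a matrix in which each column has exactly s non-zero entries, each equal to ±1/√s at uniformly chosen rows. The specific parameter values and the helper name `sparse_jl_matrix` are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sparse_jl_matrix(m, n, s, seed=None):
    """Sample a sparse embedding matrix: each column has exactly s
    non-zero entries, each +/- 1/sqrt(s), at uniformly chosen rows.
    (Illustrative variant; not necessarily the paper's construction.)"""
    rng = np.random.default_rng(seed)
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)   # s distinct rows
        signs = rng.choice([-1.0, 1.0], size=s)       # random signs
        Phi[rows, j] = signs / np.sqrt(s)
    return Phi

# Embed a random unit vector and compare squared norms.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
x /= np.linalg.norm(x)
Phi = sparse_jl_matrix(m=256, n=1000, s=8, seed=1)
distortion = abs(np.linalg.norm(Phi @ x) ** 2 - 1)
print(distortion)  # small with high probability over the draw of Phi
```

Note that only s = 8 of the 256 entries in each column are non-zero, so applying Φ to a vector with t non-zero coordinates costs O(s·t) rather than O(m·t).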
Let Φ ∈ ℝ^{m×n} be a sparse Johnson–Lindenstrauss transform (Kane and Nelson in J ACM 61(1):4, 2014) with s non-zeroes per column. For a subset T of the unit sphere and a given ε ∈ (0, 1/2), we study the settings of m and s required to ensure

  𝔼_Φ sup_{x∈T} |‖Φx‖₂² − 1| < ε,

i.e. so that Φ preserves the norm of every x ∈ T simultaneously and multiplicatively up to 1 + ε. We introduce a new complexity parameter, which depends on the geometry of T, and show that it suffices to choose s and m such that this parameter is small. Our result is a sparse analog of Gordon’s theorem, which was concerned with a dense Φ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson–Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson–Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso.
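The quantity 𝔼_Φ sup_{x∈T} |‖Φx‖₂² − 1| can be estimated empirically for a finite T by averaging over independent draws of Φ. The sketch below does this with a toy set T of random unit vectors and an assumed sparse-JL sampler (s non-zeros per column, entries ±1/√s); all dimensions and trial counts are illustrative choices, not values from the paper.

```python
import numpy as np

def sample_sparse_jl(m, n, s, rng):
    """One draw of a sparse JL matrix: s non-zeros per column,
    entries +/- 1/sqrt(s) at uniformly chosen rows (assumed variant)."""
    Phi = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)
        Phi[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return Phi

rng = np.random.default_rng(0)
n, m, s = 200, 128, 8
# A toy finite subset T of the unit sphere: 50 random unit vectors.
T = rng.standard_normal((50, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)

# Monte Carlo estimate of E_Phi sup_{x in T} | ||Phi x||_2^2 - 1 |.
trials = 20
sups = []
for _ in range(trials):
    Phi = sample_sparse_jl(m, n, s, rng)
    norms_sq = np.sum((T @ Phi.T) ** 2, axis=1)  # ||Phi x||_2^2 per x in T
    sups.append(np.max(np.abs(norms_sq - 1)))
estimate = float(np.mean(sups))
print(estimate)
```

Increasing m (and, for richer T, the sparsity s) drives the estimate down, mirroring the trade-off between m, s, and the geometry of T that the complexity parameter captures.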