$\newcommand{\qed}{\tag*{\blacksquare}} \renewcommand\qedsymbol{\blacksquare} \newcommand{\1}{\mathbb{1}} \DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator*{\argmax}{argmax} \newcommand{\x}{\times} \newcommand{\Z}{\mathbb{Z}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\R}{\mathbb{R}} \newcommand{\N}{\mathcal{N}} \newcommand{\E}{\mathop{\mathbb{E}}} \renewcommand{\bar}{\overline} \renewcommand{\epsilon}{\varepsilon} \newcommand{\bmqty}[1]{\begin{bmatrix}#1\end{bmatrix}}$

# Learning With Errors

## Discrepancy: a constructive proof via random walks in the hypercube


In this post, I'll give two constructive/algorithmic discrepancy upper bounds. The first, by Beck and Fiala, applies to sparse set systems. The second, by Lovett and Meka, improves on the Beck-Fiala result and also matches the guarantees of Spencer's theorem.

## Discrepancy: definitions and Spencer's six standard deviations


This is the first in a series of (probably at least 3) blog posts about discrepancy minimization. There are already many expositions of this topic, and it is not clear that the world needs yet another, but here it is :). In this post, I'll introduce the basic definitions, give one simple upper bound and one simple lower bound, and then prove Spencer's famous (and elegant) “six standard deviations” theorem. I won't focus on mathematical context, but at the end I will give pointers to some resources on the topic.
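To make the definition concrete (a toy example of mine, not from the post): the discrepancy of a set system $S_1, \dots, S_m \subseteq [n]$ is $\min_{x \in \{\pm 1\}^n} \max_j |\sum_{i \in S_j} x_i|$, which for small systems can be computed by brute force:

```python
import itertools
import math

# Incidence rows of a small set system over ground set [5]:
# S1 = {0,1,4}, S2 = {1,2,3}, S3 = {0,2,4}.
sets = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 0, 1],
]
n = len(sets[0])

def discrepancy(sets):
    """disc = min over colorings x in {-1,+1}^n of max_j |sum_{i in S_j} x_i|."""
    best = math.inf
    for x in itertools.product([-1, 1], repeat=n):
        worst = max(abs(sum(a * xi for a, xi in zip(row, x))) for row in sets)
        best = min(best, worst)
    return best

print(discrepancy(sets))  # -> 1 (every set here has odd size, so 0 is impossible)
```

A uniformly random coloring only guarantees discrepancy $O(\sqrt{n \log m})$ for $m$ sets; Spencer's theorem says $6\sqrt{n}$ suffices when $m = n$.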
Full post →

## Constructive Hardness Amplification via Uniform Direct Product

— post as [PDF]

This post was motivated by trying to understand the recent paper “Learning Algorithms from Natural Proofs”, by Carmosino-Impagliazzo-Kabanets-Kolokolova [CIKK16]. They crucially use the fact that several results in hardness amplification can be made constructive. In this post, we will look at the Uniform Direct Product Theorem of Impagliazzo-Jaiswal-Kabanets-Wigderson [IJKW10]. We will state the original theorem and algorithm of [IJKW10], then we will present a simpler analysis for a (weaker) non-uniform version of their algorithm, which contains some of the main ideas.

For a given function $f: \{0, 1\}^n \to \{0, 1\}^\ell$, say a circuit $C$ “$\epsilon$-computes $f$” if $C$ computes $f$ correctly on at least an $\epsilon$-fraction of inputs. That is, $\Pr_x[C(x) = f(x)] \geq \epsilon$. We are interested in the following kind of direct product theorem (informally): “If the function $f$ cannot be $\epsilon$-computed by any small circuit $C$, then the direct product $f^{\otimes k}(x_1, x_2, \dots, x_k) := (f(x_1), f(x_2), \dots, f(x_k))$ cannot be computed with success probability better than roughly $\epsilon^k$ by any similarly small circuit.”¹

¹ If this seems trivial, consider the $k=2$ case. We want to show that if $\Pr_x[C(x) = f(x)] \leq \epsilon$ for all small circuits $C$, then $\Pr_{x, y}[C'(x, y) = (f(x), f(y))] \lesssim \epsilon^2$ for all similarly small circuits $C'$. This is clearly true if the circuit $C'$ operates independently on its inputs, but not as clear otherwise (e.g., the correctness of $C'$'s two outputs could be highly correlated). Indeed, proofs of the direct-product theorem take advantage of this correlation.

This is usually proved² in contrapositive, by showing: if there exists a circuit $C'$ that $\epsilon^k$-computes $f^{\otimes k}$, then there exists a similarly-sized circuit $C$ that $\epsilon$-computes $f$. The very interesting part is that this amplification can be made fully constructive, by a simple algorithm.

² See the last section for good references to prior proofs.
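As a sanity check on the easy direction (my own toy calculation, not from [IJKW10]): if a circuit happens to act independently on its $k$ inputs, its direct-product success probability is exactly $\epsilon^k$. The snippet below verifies this exhaustively on a small domain, modeling a hypothetical "circuit" as a lookup table:

```python
import itertools
from fractions import Fraction

n, k = 4, 2
domain = list(range(2 ** n))
f = {x: bin(x).count("1") % 2 for x in domain}  # a fixed boolean function (parity)

# A hypothetical "circuit" C, modeled as a lookup table, that agrees with f
# on exactly a 1/4 fraction of inputs and errs everywhere else.
good = set(domain[: len(domain) // 4])
C = {x: f[x] if x in good else 1 - f[x] for x in domain}
eps = Fraction(len(good), len(domain))  # = 1/4

# The product circuit C'(x1,...,xk) = (C(x1),...,C(xk)) acts independently on
# its inputs, so its success probability is exactly eps^k.
hits = sum(
    all(C[x] == f[x] for x in xs)
    for xs in itertools.product(domain, repeat=k)
)
p = Fraction(hits, len(domain) ** k)
print(p, eps ** k)  # both 1/16
```

The content of the theorem is that even circuits $C'$ that correlate their outputs cannot do much better than this.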
Full post →

## New Theory Blog


We’re starting a theory student blog! The idea is, this is a collaborative blog about theoretical computer science, where people can post about interesting things they’re learning / have learnt. The goal is to help everyone learn from each other, and also have a forum for student discussion.

Hopefully this will help make TCS concepts more accessible: Sometimes reading the original paper is not the best / most efficient way to learn something, and there are several perspectives or proofs of the same thing that are not explicitly written down in the literature, but are known in the community. We hope this blog will be a way to share this knowledge among theory students.

One important aspect is, we want this to feel like an informal place to learn and discuss – think more like chalk talks than STOC talks. It’s fine to have rough calculations, sketched figures, etc – the emphasis is on explaining things nicely. And we want to encourage asking clarifying questions and discussing in the comments.

## On posts:

• Posts can be about anything technical that other students may find interesting or learn from. Anything from current research to classical results.

• The length and thoroughness can vary: anything from “survey of this field” to “summary of cool paper” to “interesting technical lemma” to “something cool I learnt this week”, etc.

• You don’t need to be an expert on the topic to write about it (as the name suggests, there may be some errors, but hopefully also some learning).

• The aim is to convey interesting or useful techniques and intuition (e.g., try not to just announce a new result without explaining the ideas behind it).

## Contributing:

Everyone is welcome (and encouraged!) to contribute – including non-students, and generally anyone interested in TCS.

The easiest way is to simply write a LaTeX or Markdown document, and email it to me (preetum [at] berkeley). There is also a harder way.

Ideally, both readers and writers would get something out of this blog. (Personally, I like to present topics to make sure I understand them fully. And of course, we can have an interesting discussion about it.)

## Comments and Subreddit:

We have comments below each post, which we encourage people to use to discuss the post.

We also have the subreddit r/LWE, which we hope can be used as a more general forum among theory students. Feel free to use this for both blog-related things and general theory questions. Let’s see how this works.

## Initial Posts:

We’re launching with posts on:

Thanks especially to the above people (and all future authors) for contributing.

## Conclusion and Open Questions

When conceiving this blog, we had some other ideas for things that should exist, such as a set of collaboratively-edited pages on “How to best learn topic X”. Are people interested in contributing to something like this? In general, any suggestions for things you would like to see (regarding this blog, or otherwise)?

Feel free to use the comments section below.

Full post →

## Pseudo-calibration for Planted Clique Sum-of-Squares Lower Bound

— post as [PDF]

Recently, Barak, Hopkins, Kelner, Kothari, Moitra and Potechin [BHKKMP16] proved an essentially tight Sum-of-Squares lower bound for the planted clique problem. Their result has two main parts: coming up with the pseudo-distribution, and proving positivity of that pseudo-distribution. In this short post, we summarize the first part of the paper, which provides a general, systematic way to come up with pseudo-distributions for problems beyond planted clique, without going into details of the proof. We do not touch here on the second part, which is more technically involved, but we hope to do so in future posts.
Full post →

## Deterministic Sparsification


Let $G$ be a dense graph. A sparse graph $H$ is a sparsifier of $G$: an approximation of $G$ that preserves certain properties, such as the quadratic forms of its Laplacian. This post will formally define spectral sparsification, then present the intuition behind the deterministic construction of spectral sparsifiers by Batson, Spielman and Srivastava [BSS08].
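As a quick illustration of the quadratic-form guarantee (a sketch of mine, not the BSS construction, which is deterministic): the snippet below builds a crude randomized sparsifier of $K_8$ by subsampling and reweighting edges, then empirically measures how well it preserves $x^T L x$:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplacian(n, edges):
    """Weighted graph Laplacian from a list of (u, v, weight) edges."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

n = 8
G_edges = [(u, v, 1.0) for u in range(n) for v in range(u + 1, n)]  # K_8
L_G = laplacian(n, G_edges)

# Crude randomized "sparsifier": keep each edge with probability p, reweighted
# by 1/p so that E[L_H] = L_G.  (BSS picks edges deterministically, one by one,
# and achieves far better guarantees with O(n) edges.)
p = 0.5
H_edges = [(u, v, w / p) for u, v, w in G_edges if rng.random() < p]
L_H = laplacian(n, H_edges)

# Empirically bound max |x^T L_H x / x^T L_G x - 1| over random x orthogonal to 1.
worst = 0.0
for _ in range(1000):
    x = rng.standard_normal(n)
    x -= x.mean()  # project out the all-ones kernel of the Laplacian
    worst = max(worst, abs((x @ L_H @ x) / (x @ L_G @ x) - 1))
print(f"worst multiplicative error over sampled x: {worst:.3f}")
```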

Full post →

## Intro to the Sum-of-Squares Hierarchy


This note is intended to introduce the Sum-of-Squares Hierarchy. We start with SDP relaxations, using the Goemans-Williamson Max-Cut SDP as a jumping-off point. We then discuss duality and sum-of-squares proofs. Finally, we give a non-trivial example application of the sum-of-squares hierarchy: an algorithm for finding planted sparse vectors inside random subspaces. We will give no historical context, but the final section has pointers to other resources that give a better sense of the history, along with alternative expositions of the same content.
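As a taste of the starting point (my own toy example, not code from the post): for the triangle, the Max-Cut SDP optimum places unit vectors $120°$ apart in the plane, and Goemans-Williamson hyperplane rounding recovers a maximum cut:

```python
import math
import random

random.seed(0)

# The triangle C_3: Max-Cut = 2.  An optimal Max-Cut SDP solution places
# unit vectors at 120-degree angles in the plane.
angles = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
vecs = [(math.cos(a), math.sin(a)) for a in angles]
edges = [(0, 1), (1, 2), (2, 0)]

# SDP objective: sum over edges of (1 - <v_u, v_v>) / 2 = 3 * (3/4) = 2.25.
sdp_value = sum((1 - (vecs[u][0] * vecs[v][0] + vecs[u][1] * vecs[v][1])) / 2
                for u, v in edges)

def gw_round():
    """Goemans-Williamson rounding: cut the vectors with a random hyperplane."""
    t = random.uniform(0, 2 * math.pi)
    g = (math.cos(t), math.sin(t))
    side = [1 if v[0] * g[0] + v[1] * g[1] >= 0 else -1 for v in vecs]
    return sum(1 for u, v in edges if side[u] != side[v])

# Each edge is cut with probability (angle between endpoints) / pi = 2/3; for
# the triangle a random hyperplane in fact always cuts exactly 2 edges, which
# matches Max-Cut (approximation ratio 2 / 2.25 ≈ 0.889 > 0.878).
avg = sum(gw_round() for _ in range(10000)) / 10000
print(sdp_value, avg)
```

The SDP value 2.25 exceeds the true Max-Cut value 2; higher levels of the sum-of-squares hierarchy tighten such gaps.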
Full post →

## Simple Lower Bounds for Small-bias Spaces

— post as [PDF]

I was reading about PRGs recently, and I think a lemma mentioned last time (used for Johnson-Lindenstrauss lower bounds) can give simple lower bounds for $\epsilon$-biased spaces.

Notice:

• $2^n$ mutually orthogonal vectors require dimension at least $2^n$, but $2^n$ “almost orthogonal” vectors, with pairwise inner products $|\langle v_i, v_j \rangle| \leq \epsilon$, exist in dimension $O(n/\epsilon^2)$, by Johnson-Lindenstrauss.
• Sampling $n$ iid uniform bits requires a sample space of size $2^n$, but $n$ $\epsilon$-biased bits can be sampled from a space of size $O(n/\epsilon^2)$.

First, let's look at $k$-wise independent sample spaces, and see how the lower bounds might be extended to the almost $k$-wise independent case.

Note: To skip the background, just see Lemma 1, and its application in Claim 3.
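As a quick numerical illustration of the first bullet above (my own experiment, not from the post): $2^{11}$ random unit sign-vectors in dimension $256 \ll 2^{11}$ are already pairwise almost orthogonal:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 11
N = 2 ** n   # 2048 vectors, "one per string in {0,1}^n"
d = 256      # dimension much smaller than N

# Random sign vectors scaled to unit norm; by concentration, all pairwise
# inner products are small with high probability.
V = rng.choice([-1.0, 1.0], size=(N, d)) / np.sqrt(d)

gram = V @ V.T
np.fill_diagonal(gram, 0.0)
max_inner = np.abs(gram).max()
print(max_inner)  # typically around sqrt(4 n ln 2 / d) ≈ 0.35 here
```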
Full post →

## Fast Johnson-Lindenstrauss

— post as [PDF]

The Johnson-Lindenstrauss (JL) Transform says that, informally, we can embed high-dimensional points into a much lower dimension, while still preserving their pairwise distances. In this post we'll start with the classical JL transform, then focus on the Fast JL Transform (FJLT) by Ailon and Chazelle [AC09], which achieves the JL embedding more efficiently (w.r.t. runtime and randomness). We'll look at the FJLT from the perspective of “preconditioning” a sparse estimator, which comes with nice intuition from Fourier duality. We conclude by mentioning more recent developments in this area (faster/sparser/derandomizeder).
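To give a flavor of the construction (a minimal sketch of mine, with hypothetical helper names `fwht` and `fjlt_sketch`; the actual FJLT of [AC09] uses a sparse Gaussian projection rather than plain coordinate sampling): the map is $x \mapsto S H D x$, where $D$ is a random $\pm 1$ diagonal, $H$ is the Walsh-Hadamard transform (the "preconditioner"), and $S$ samples a few coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of 2."""
    x = x.astype(float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / np.sqrt(len(x))

def fjlt_sketch(x, k, rng):
    """Sample k coordinates of H D x, rescaled: a simplified FJLT-style map."""
    d = len(x)
    D = rng.choice([-1.0, 1.0], size=d)  # random diagonal signs
    y = fwht(D * x)                      # preconditioning spreads mass over coordinates
    idx = rng.integers(0, d, size=k)     # sample k coordinates (with replacement)
    return y[idx] * np.sqrt(d / k)       # rescale so norms are preserved in expectation

d, k = 1024, 128
x = rng.standard_normal(d)
y = fjlt_sketch(x, k, rng)
print(np.linalg.norm(y) / np.linalg.norm(x))  # close to 1
```

Since $HD$ is orthogonal, the norm is exact before sampling; preconditioning ensures no single coordinate carries too much mass, so sampling only $k \ll d$ coordinates still concentrates.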
Full post →