By Karl Rohe, Sourav Chatterjee and Bin Yu University of …

Submitted to the Annals of Statistics SPECTRAL CLUSTERING AND THE HIGH-DIMENSIONAL STOCHASTIC BLOCKMODEL By Karl Rohe, Sourav Chatterjee and Bin …

University of California, Berkeley

This is a research paper by Bin Yu, a professor of statistics and electrical engineering at UC Berkeley, on the topic of network tomography. The paper introduces the basic …

Embracing Statistical Challenges in the Information …

ulate more exchanges in the statistics community as well as among the different disciplines so that statistics as a field adequately positions itself in the future development of cyberinfrastructure. The rest of the paper is organized as follows. Section 2 describes three areas of science where massive data sets arise.

THE COMPOSITE ABSOLUTE PENALTIES FAMILY FOR …

Submitted to the Annals of Statistics THE COMPOSITE ABSOLUTE PENALTIES FAMILY FOR GROUPED AND HIERARCHICAL VARIABLE SELECTION By Peng Zhao and Guilherme Rocha and Bin Yu† University of California at Berkeley Extracting useful information from high-dimensional data is an important focus of today's statistical …

Bin Yu

Welcome. I'm Bin Yu, the head of the Yu Group at Berkeley, which consists of 15-20 students and postdocs from statistics and EECS. I was formally trained as a statistician, but my research interests and achievements extend beyond the realm of statistics. Together with my group, my work has leveraged new computational developments to solve ...

Embracing Statistical Challenges in the Information …

statistics because the aim of statistics is to extract useful information from data. Statistics has been undergoing major changes over the last few decades, stimulated primarily by the advances in computing. There has been considerable discussion and introspection within the statistics community about the future of the discipline.

Multi-task Sparse Discriminant Analysis (MtSDA) with …

Berkeley, CA 94720. Abstract: Multi-task learning aims at combining information across tasks to boost prediction performance, especially when the number of training samples is small and the number of predictors is very large. In this paper, we first extend the Sparse Dis-

Cloud Detection over Snow and Ice Using MISR Data

1 Introduction Clouds play a major role in the Earth's climate because of their ubiquitous presence and their ability to interact with Sun- (i.e., solar) and Earth- (i.e., terrestrial) generated radiation.

Embracing Statistical Challenges in the Information …

from data, interpreted in a broad sense. Statistics as a discipline has its primary role as assisting this knowledge/information acquisition process in a principled and scientific manner. IT data are massive or high-dimensional, no matter whether the forms are old (numeric) or new (text, images, videos, sound, and multi-media).

Bin Yu

Departments of Statistics and Electrical Engineering and Computer Sciences ... 367 Evans Hall #3860 • Berkeley, CA 94720. phone: 510-642-2781 • fax: 510-642-7892. Welcome. I'm Bin Yu, the head of the Yu Group at ... UC Berkeley to lead $10M NSF/Simons Foundation program to investigate theoretical underpinnings …

Stability

Stability Bernoulli 19(4), 2013, 1484–1500 DOI: 10.3150/13-BEJSP14 Stability BIN YU Departments of Statistics and EECS, University of California at Berkeley, Berkeley, …

6976 IEEE TRANSACTIONS ON INFORMATION THEORY, …

Garvesh Raskutti, Martin J. Wainwright, Senior Member, IEEE, and Bin Yu, Fellow, IEEE. Abstract: Consider the high-dimensional linear regression model y = Xβ∗ + w, where y ∈ R^n is an observation vector, X ∈ R^(n×d) is a design matrix with d > n, β∗ is an unknown regression vector, and w ∼ N(0, σ²I) is additive Gaussian noise. This paper studies the minimax rates of convergence for esti-
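The linear regression setup named in this abstract (an observation vector, a design matrix, an unknown regression vector, and additive Gaussian noise) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `simulate_linear_model`, `n`, `d`, `s`, and `sigma` are hypothetical, the design entries are drawn as standard Gaussians, and the regression vector is made sparse with `s` nonzero entries equal to 1. None of these choices comes from the snippet itself.

```python
import random


def simulate_linear_model(n, d, s, sigma, seed=0):
    """Draw (X, beta, y) with y = X @ beta + w and w ~ N(0, sigma^2 I).

    ASSUMPTIONS (not from the source): X has i.i.d. N(0, 1) entries and
    beta has exactly s nonzero coordinates, each set to 1.0.
    """
    if s > d:
        raise ValueError("s must not exceed d")
    rng = random.Random(seed)
    # n-by-d design matrix stored as a list of rows.
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # Sparse regression vector: s randomly chosen coordinates are nonzero.
    beta = [0.0] * d
    for j in rng.sample(range(d), s):
        beta[j] = 1.0
    # Observation vector: linear signal plus Gaussian noise, row by row.
    y = [
        sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) + rng.gauss(0.0, sigma)
        for row in X
    ]
    return X, beta, y


if __name__ == "__main__":
    # High-dimensional regime: more columns (d) than observations (n).
    X, beta, y = simulate_linear_model(n=5, d=20, s=3, sigma=0.5)
    print(len(y), len(X[0]), sum(1 for b in beta if b != 0.0))
```

Choosing `d` larger than `n` puts the simulation in the high-dimensional regime the abstract refers to; every other setting is illustrative.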

stat.berkeley.edu

Spectral clustering and the high-dimensional Stochastic …

791.pdf

Statistics at UC Berkeley | Department of Statistics

Learn how to use the L1 norm for robust estimation and sparse recovery in this paper by Aenlle-Rocha and Yu, two professors from Berkeley's statistics department.

Spectral clustering and the high-dimensional Stochastic …

Department of Statistics, University of California, Berkeley, CA 94720, USA

University of California, Berkeley

RASKUTTI, WAINWRIGHT AND YU our three main theorems, with the more technical details deferred to the Appendices. We conclude with a discussion in Section 5. 2. Background and Prob

University of California, Berkeley

Journal of Machine Learning Research 18 (2018) 1-56. Submitted 3/16; Revised 7/17; Published 4/18. Local Identifiability of ℓ1-minimization Dictionary Learning: a Sufficient and Alm

Predicting Execution Time of Computer Programs …

Bin Yu UC Berkeley Byung-Gon Chun Intel Labs Berkeley byung-gon.chun@intel Petros Maniatis Intel Labs Berkeley petros.maniatis@intel Mayur Naik Intel Labs Berkeley mayur.naik@intel Abstract Predicting the execution time of computer programs is an important but challeng-

UCB Statistics Department Faculty

Statistics Department Faculty. DAVID ALDOUS Interests: Markov chains, rare events, analysis of algorithms, probabilistic combinatorics, discrete statistical physics, miscellaneous applied probability. Office: 351 Evans (Joint with Mathematics). RUDOLPH BERAN Interests: Experimental statistics, superefficient …

Embracing Statistical Challenges in the Information …

technologies have to be integrated into statistics, and statistical thinking in turn must be integrated into computer technologies. 1. INTRODUCTION "Information technology (IT) is a broad subject concerned with technology and other aspects of managing and processing information, especially in large organizations.

Efficient algorithms for discrete universal denoising …

Tampere University of Technology P.O. Box 553, FIN-33101 Tampere, Finland Email: ciprian.giurcaneanu@tut.fi Bin Yu Department of Statistics University of California at Berkeley Berkeley, CA 94720-3860 USA. Abstract: The paper is focused on the problem of discrete universal denoising: one estimates the input ...

Discovering Word Associations in News Media via …

Berkeley, CA 94720. Sophie Clavier, Dept. of International Relations, San Francisco State University, San Francisco, CA 94132. ABSTRACT We analyze the "image" of a given query word in a given corpus of text news by producing a short list of other words

Various choices of the loss function are possible, including (a) the model selection loss, which is zero if supp(β̂) = supp(β∗) and one otherwise; (b) the ℓp-losses Lp(β̂, β∗
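The two loss functions named above can be sketched in a few lines of Python. The model selection loss follows the snippet directly: it is 0 when the estimated and true supports agree and 1 otherwise. The snippet is cut off before it defines the ℓp-loss, so the form used in `lp_loss` (the sum of p-th powers of the absolute coordinate differences) is an assumption, as are all function names and the `tol` threshold.

```python
from typing import FrozenSet, Sequence


def support(beta: Sequence[float], tol: float = 0.0) -> FrozenSet[int]:
    """Indices of the entries of beta whose magnitude exceeds tol."""
    return frozenset(j for j, b in enumerate(beta) if abs(b) > tol)


def model_selection_loss(
    beta_hat: Sequence[float], beta_star: Sequence[float], tol: float = 0.0
) -> int:
    """0 if supp(beta_hat) == supp(beta_star), 1 otherwise."""
    return 0 if support(beta_hat, tol) == support(beta_star, tol) else 1


def lp_loss(
    beta_hat: Sequence[float], beta_star: Sequence[float], p: float = 2.0
) -> float:
    """ASSUMED form: sum over j of |beta_hat[j] - beta_star[j]| ** p.

    The source text is truncated before its own definition, so this
    formula is a guess for illustration only.
    """
    if len(beta_hat) != len(beta_star):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(beta_hat, beta_star))


if __name__ == "__main__":
    beta_star = [2.0, 0.0, 3.0]
    beta_hat = [0.5, 0.0, -1.0]
    # Same support {0, 2}, so the model selection loss is 0 even though
    # the coefficient values differ.
    print(model_selection_loss(beta_hat, beta_star))
    print(lp_loss(beta_hat, beta_star, p=2.0))
```

The example shows that the model selection loss ignores coefficient magnitudes entirely, whereas an ℓp-type loss measures how far the estimated values are from the truth.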

Information In The Non-Stationary Case

Department of Statistics, University of California, Berkeley; Department of Statistics and Center for the Neural Basis of Cognition, Carnegie Mellon University. July 18, 2008. Abstract: Information estimates such as the "direct method" of Strong et al. (1998) sidestep

Codes and Models Bin Yu

A personal and biased list: Kullback ('51) Mutual information and sufficiency Jaynes ('57) Maximum entropy method Kullback and Leibler ('59) KL divergence

Local Identifiability of ℓ1-minimization Dictionary …

Siqi Wu, Bin Yu. Department of Statistics, University of California, Berkeley, CA 94720-1776, USA. Editor: Hui Zou. Abstract: We study the theoretical properties of learning a dictionary from N signals x_i ∈ R^K for i = 1, ..., N via ℓ1-minimization. We assume that the x_i's are i.i.d. random linear combina-

Embracing Statistical Challenges in the Information …

government, industry, university, individual alike are using computer technology to create a gigantic amount of text in various file formats (e.g. doc, txt, pdf, ps) and in all human languages (currently in use or endangered or long-dead). Images and videos are much larger than text files.

High-dimensional covariance estimation by minimizing ℓ

Berkeley, CA 94720-1776 USA. Abstract: Given i.i.d. observations of a random vector X ∈ R^p, we study the problem of estimating both its covariance matrix Σ∗, and its inverse covariance or concentration …

Embracing Statistical Challenges in the Information …

In this paper we showcase the IT challenges and opportunities for statistics. Diverse IT areas are reviewed, and we share our IT research experience by giving examples and covering two projects of the author and co-workers to demonstrate the breadth and variety of IT problems and to ground the paper. The material covered re