**cellHTS2** provides different options for handling and normalizing high-throughput screening data. Here we provide a short overview of normalization strategies.

- Introduction
- Median normalization (sample-based normalization)
- Shorth normalization (sample-based normalization)
- Mean normalization (sample-based normalization)
- Normalization on Negative Controls (control-based normalization)
- Percent of Control (control-based normalization)
- Normalized Percent of Control (control-based normalization)
- B-Score Normalization
- Loess Regression and Robust Local Fit Regression
- References and further reading

cellHTS2 implements a number of normalization options to correct for plate-to-plate differences in an experiment. In general, one distinguishes between control-based and sample-based normalization methods, both of which have advantages and disadvantages. Sample-based normalization strategies are mostly used if the number of "hits" in a plate is assumed to be rather low; they fail to be robust if many wells show phenotypic changes. This is particularly important to consider if RNAi (or compound) reagents have a non-random distribution across the experiment, as is the case for several siRNA libraries and for experiments designed to retest previously identified "hits". Control-based normalization methods can avoid such pitfalls; however, since the number of controls per plate is usually limited, they are prone to variation in the control wells. Spatial effects can be corrected by B-score and loess transformations.

In general, the choice of normalization method depends strongly on the experimental design and the quality of the experimental results. No general recommendation can be given for all experiments, but we usually advise starting with a simple (e.g. median) normalization method, assessing the data with plate plots, and checking whether spatial normalization might be necessary.

Plate median normalization scales each plate by calculating the signal of each well relative to the median of all sample wells in the plate. The median is calculated over the sample wells (i.e. all wells that contain reagents targeting genes of interest) in a given result file. In many cases, plate median normalization is the preferred (and least stringent) normalization option, but it should be avoided if there are spatial effects on plates in the HTS data set.
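
The arithmetic behind plate median normalization can be sketched in a few lines of plain Python. This is only an illustration and not the cellHTS2 implementation: the function name `median_normalize`, the flat list layout and the example values are assumptions, and the sketch assumes a multiplicative scale (division by the median) rather than subtraction.

```python
from statistics import median


def median_normalize(plate, is_sample):
    """Divide every well by the median of the sample wells on the plate.

    plate     -- list of raw well values (hypothetical flat layout)
    is_sample -- list of booleans, True where the well holds a sample
    """
    sample_values = [value for value, flag in zip(plate, is_sample) if flag]
    plate_median = median(sample_values)
    if plate_median == 0:
        raise ValueError("sample median is zero; cannot scale")
    return [value / plate_median for value in plate]


# Hypothetical plate: four sample wells followed by two control wells.
plate = [100.0, 120.0, 80.0, 110.0, 20.0, 200.0]
is_sample = [True, True, True, True, False, False]
normalized = median_normalize(plate, is_sample)
```

After scaling, the sample wells of every plate are centred on 1, which is what makes values comparable across plates.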

Shorth normalization is a variant of plate median normalization that uses the midpoint of the shorth (the shortest interval that covers half of the values) of the per-plate distribution of sample-well values. This is appropriate, for example, if the distribution of sample values has multiple peaks because reagents are non-randomly distributed throughout an experiment.
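
To make the idea of the shorth concrete, here is a minimal sketch in plain Python. It is not the code used by cellHTS2: the function name `shorth_midpoint` is hypothetical, and the convention that the interval must cover `n // 2 + 1` points is an assumption (other definitions of "half of the data" exist).

```python
def shorth_midpoint(values):
    """Return the midpoint of the shortest interval covering about half the values."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    h = n // 2 + 1  # assumed convention for "half of the data"
    # Slide a window of h consecutive sorted values and keep the narrowest one.
    start = min(range(n - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
    return (xs[start] + xs[start + h - 1]) / 2


# Hypothetical bimodal plate: a main peak near 1 and a smaller peak near 3.
values = [1.0, 1.1, 0.9, 1.05, 0.95, 3.0, 3.2, 2.9]
center = shorth_midpoint(values)
```

For this example the shorth midpoint sits on the main peak near 1, whereas the plain median (1.075) is pulled towards the second peak.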

Mean normalization is a variant of plate median scaling that divides each sample value by the per-plate average. This normalization is less robust against outliers than median normalization.
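
The difference in robustness is easy to see on a toy example. The following sketch uses made-up values with a single strong outlier and compares the two scaling factors; it illustrates the statistics only and does not call cellHTS2.

```python
from statistics import mean, median

# Hypothetical plate: four unremarkable wells and one strong outlier.
plate = [1.0, 1.1, 0.9, 1.0, 10.0]

plate_mean = mean(plate)      # pulled upwards by the outlier
plate_median = median(plate)  # unaffected by the outlier

mean_scaled = [value / plate_mean for value in plate]
median_scaled = [value / plate_median for value in plate]
```

With mean scaling, a typical well with raw value 1.0 ends up far below 1 because the outlier inflates the per-plate average; with median scaling it stays at 1.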

Normalization on negative controls scales the sample measurements by the per-plate median of the values of the wells annotated in the plate configuration file as "negative controls". This method is particularly appropriate if many sample values are likely to show an effect. Note that the number of negative controls per assay plate should be sufficiently high; otherwise even small differences in negative control values might lead to a significant shift in the reported results.

Percent of control follows the same approach as normalization on negative controls, except that wells annotated as "positive controls" are used as the reference.
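
Both control-based scalings can be sketched with one helper that differs only in which wells serve as the reference. This is an illustration in plain Python, not the cellHTS2 code: `scale_by_controls`, the well indices and the values are hypothetical, and the sketch assumes the median is used as the reference in both cases (the package itself may use a different summary statistic or report percentages).

```python
from statistics import median


def scale_by_controls(plate, control_indices):
    """Divide every well by the median of the chosen control wells."""
    reference = median([plate[i] for i in control_indices])
    if reference == 0:
        raise ValueError("control median is zero; cannot scale")
    return [value / reference for value in plate]


# Hypothetical plate: wells 4-5 are negative controls, wells 6-7 positive controls.
plate = [50.0, 100.0, 200.0, 100.0, 110.0, 90.0, 10.0, 20.0]
negative_wells = [4, 5]
positive_wells = [6, 7]

relative_to_negative = scale_by_controls(plate, negative_wells)
relative_to_positive = scale_by_controls(plate, positive_wells)
```

With only two control wells per plate, a small shift in one of them changes the reference noticeably, which is why a sufficient number of controls matters.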

Normalized percent of control, also known as "normalized percentage inhibition" (NPI), calculates a well result by dividing the difference between the sample measurement and the average of the positive controls by the difference between positive and negative controls.
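
As a worked example of the formula, the following sketch computes the value for a single well. The function name `npi` and the control values are hypothetical, and the sign convention (positive-control average minus sample in the numerator, so that positive controls map to 0 and negative controls to 1) is an assumption.

```python
from statistics import mean


def npi(value, positive_controls, negative_controls):
    """Normalized percent inhibition of one well (assumed sign convention)."""
    pos_mean = mean(positive_controls)
    neg_mean = mean(negative_controls)
    if pos_mean == neg_mean:
        raise ValueError("positive and negative controls have the same mean")
    return (pos_mean - value) / (pos_mean - neg_mean)


# Hypothetical control wells from one plate.
positive_controls = [10.0, 12.0]    # mean 11
negative_controls = [100.0, 104.0]  # mean 102

halfway = npi(56.5, positive_controls, negative_controls)
```

A well lying exactly between the two control averages, such as 56.5 here, receives a value of 0.5.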

B-score normalization can remove row and column biases within each plate by fitting a two-way median polish to the raw data on a per-plate basis. This method is particularly useful if row or column effects are observed in plates (e.g. due to pipetting differences or evaporation).
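
The two-way median polish at the core of this method can be sketched as follows. This is a simplified illustration in plain Python, not the cellHTS2 implementation: the function name, the iteration limit and the example plate are assumptions. A B-score is commonly obtained by additionally dividing the residuals by the plate's median absolute deviation; that step is omitted here.

```python
from statistics import median


def median_polish_residuals(plate, max_iter=10, tol=1e-9):
    """Alternately subtract row and column medians; return the residuals.

    plate -- list of rows, each a list of well values
    """
    resid = [row[:] for row in plate]
    n_cols = len(resid[0])
    for _ in range(max_iter):
        change = 0.0
        for row in resid:  # sweep out row medians
            m = median(row)
            change += abs(m)
            for j in range(n_cols):
                row[j] -= m
        for j in range(n_cols):  # sweep out column medians
            m = median([row[j] for row in resid])
            change += abs(m)
            for row in resid:
                row[j] -= m
        if change <= tol:  # nothing left to remove
            break
    return resid


# Hypothetical 3 x 4 plate with a row gradient, a column gradient and one hit.
plate = [[100.0 + i + 10.0 * j for j in range(4)] for i in range(3)]
plate[1][2] += 50.0
residuals = median_polish_residuals(plate)
```

In this example the row and column gradients are removed completely, and only the well with the added signal keeps a large residual.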

cellHTS2 provides additional spatial normalization methods that fit a polynomial surface to the intensities within each assay plate using local regression. They can be performed via the normalizePlates or spatialNormalization functions, although we advise applying these methods through the former. The fit can be performed using either the loess procedure or the locfit.robust function of the locfit package. In normalizePlates, if method="locfit", spatial effects are removed by fitting a bivariate local regression to each plate and replicate, while if method="loess", a loess curve is fitted instead.

1. Bioconductor/R cellHTS2 description [link]

2. Boutros M, Brás LP, Huber W. (2006). Analysis of cell-based RNAi screens. Genome Biol. 7:R66. [link]

3. Malo N, Hanley JA, Cerquozzi S, Pelletier J, Nadon R. (2006). Statistical practice in high-throughput screening data analysis. Nat Biotechnol. 24:167-75. [link]

4. Birmingham A, Selfors LM, Forster T, Wrobel D, Kennedy CJ, Shanks E, Santoyo-Lopez J, Dunican DJ, Long A, Kelleher D, Smith Q, Beijersbergen RL, Ghazal P, Shamu CE. (2009). Statistical methods for analysis of high-throughput RNA interference screens. Nat Methods 6:569-75. [link]