java.lang.Object
  edu.cmu.cs.sb.core.DataSetCore
public class DataSetCore
The class encapsulates a set of gene expression data
Field Summary | | |
---|---|---|
boolean | badd0 | True if a column of initial 0's should be added to the data file, false otherwise. |
boolean | bfullrepeat | True if repeat data comes from distinct time series, that is, a longitudinal time series; false if each column of the same time point is interchangeable between data sets. |
boolean | bmaxminval | True if the gene change threshold for filtering is based on the max-min difference; false if it is based on the absolute difference. |
boolean | bspotincluded | True if the spot column was included in the data file. |
boolean | btakelog | True if the log-ratio is to be taken; false if the data is already in log space. |
double[][] | data | The expression data; rows are genes, columns are time points in experiments. |
double | dmincorrelation | Minimum average pairwise correlation a gene must have between full repeats if bfullrepeat is true. |
java.lang.String[] | dsamplemins | The time points at which the expression data was sampled. |
double | dthresholdvalue | The threshold value for required change. |
java.lang.String[] | genenames | The list of gene names for the current data set. |
double[][][][] | generepeatspottimedata | Present/missing data for several data sets. First dimension is gene, second is repeat, third is spot, fourth is expression value (in log-ratio form against time zero). |
int[][][][] | generepeatspottimepma | Present/missing data for several data sets. First dimension is gene, second is repeat, third is spot, fourth is present/missing value. |
double[][][] | genespottimedata | Present/missing data for one data set. First dimension is gene, second is spot, third is expression value (in log-ratio form against time zero). |
int[][][] | genespottimepma | Present/missing data for one data set. First dimension is gene, second is spot, third is present/missing value. |
java.util.HashMap | htFiltered | Contains the genes that were filtered. |
int | nmaxmissing | The maximum number of missing values a gene may have without being filtered. |
int | numcols | Number of columns in the data matrix. |
int | numrows | Number of rows in the data matrix. |
java.lang.String[] | otherInputFiles | The names of the other repeat files. |
int[][] | pmavalues | 0 if the data value is missing, non-zero if present. |
java.lang.String[] | probenames | The list of probe IDs in the current data set. |
double[] | sortedcorrvals | The distribution of all the average pairwise correlations of genes across full repeats. |
java.lang.String | szGeneHeader | The header string for the gene name column. |
java.lang.String | szInputFile | The designated main data file associated with the set; the others are repeats. |
java.lang.String | szProbeHeader | The header string for the spot ID column. |
Constructor Summary | |
---|---|
DataSetCore() | Empty constructor. |
DataSetCore(DataSetCore theDataSetCore) | Constructor copies each field. |
Method Summary | | |
---|---|---|
void | addExtraToFilter(GoAnnotations tga) | Adds those genes from tga.extragenes that were filtered to htFiltered. |
DataSetCore | averageAndFilterDuplicates() | Removes duplicate gene rows in the data file and combines their values using the median. |
protected void | dataSetReader(java.lang.String szInputFile, int nmaxmissing, double dthresholdvalue, double dmincorrelation, boolean btakelog, boolean bspotincluded, boolean brepeatset, boolean badd0) | Reads in the data file stored in szInputFile. |
DataSetCore | filterdistprofiles(DataSetCore theDataSet1, DataSetCore[] RepeatSet) | Computes the average pairwise correlation between gene repeats and stores it in sortedcorrvals. |
DataSetCore | filterDuplicates() | Removes those rows which are the duplicate of another row. |
DataSetCore | filtergenesgeneral(boolean[] keepgene, int nkeep, boolean bstore) | Filters those rows which do not have a true in keepgene; nkeep is the number of true rows in keepgene. If bstore is true and a gene is filtered, then the gene and probe list for it are stored in htFiltered. Returns a new DataSetCore object with those rows filtered. |
DataSetCore | filtergenesthreshold1point() | Filters those genes with expression below dthresholdvalue. |
DataSetCore | filtergenesthreshold2() | If bmaxminval is true, filters those genes for which the difference between the max and min value is less than dthresholdvalue. If bmaxminval is false, filters those genes for which the absolute expression change is less than dmaxval. |
DataSetCore | filterMissing() | Filters those rows which have nmaxmissing or more missing values or are missing the first time point value. |
DataSetCore | filterMissing1point() | Filters those rows that have a missing value at the first time point. |
DataSetCore | logratio2() | Converts data into log-ratio versus the first time point. |
DataSetCore | mergeDataSets(DataSetCore[] otherDataSets) | Given an array of otherDataSets, merges it with the current data set by storing in data the median of the values. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public java.lang.String szInputFile
public boolean bfullrepeat
public java.lang.String[] otherInputFiles
public java.util.HashMap htFiltered
public int nmaxmissing
public double dmincorrelation
public int numrows
public int numcols
public double[][] data
public int[][] pmavalues
public boolean bspotincluded
public int[][][] genespottimepma
public int[][][][] generepeatspottimepma
public double[][][] genespottimedata
public double[][][][] generepeatspottimedata
public boolean btakelog
public java.lang.String[] probenames
public java.lang.String[] genenames
public double[] sortedcorrvals
public java.lang.String[] dsamplemins
public double dthresholdvalue
public boolean bmaxminval
public boolean badd0
public java.lang.String szProbeHeader
public java.lang.String szGeneHeader
Constructor Detail |
---|
public DataSetCore()
public DataSetCore(DataSetCore theDataSetCore)
Method Detail |
---|
public void addExtraToFilter(GoAnnotations tga)
protected void dataSetReader(java.lang.String szInputFile, int nmaxmissing, double dthresholdvalue, double dmincorrelation, boolean btakelog, boolean bspotincluded, boolean brepeatset, boolean badd0) throws java.io.IOException, java.io.FileNotFoundException, java.lang.IllegalArgumentException
Throws:
java.io.IOException
java.io.FileNotFoundException
java.lang.IllegalArgumentException
public DataSetCore filterMissing1point()
public DataSetCore filterMissing()
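The summary describes filterMissing as dropping rows with too many missing values or with a missing first time point, where a 0 in pmavalues marks a missing value. The following is a minimal standalone sketch of that rule, not the library's implementation. The class and method names (MissingFilterSketch, rowsToKeep) are hypothetical, and the sketch assumes a row is kept when its missing count is at most the limit; the field and method descriptions differ on whether the bound is inclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class MissingFilterSketch {
    /**
     * Returns the indices of rows to keep. A row is dropped when its first
     * time point is missing or when it has more than maxMissing zeros.
     * Assumption: the bound is inclusive (missing <= maxMissing is kept).
     */
    static List<Integer> rowsToKeep(int[][] pma, int maxMissing) {
        List<Integer> keep = new ArrayList<>();
        for (int row = 0; row < pma.length; row++) {
            int missing = 0;
            for (int flag : pma[row]) {
                if (flag == 0) {
                    missing++; // 0 marks a missing value, non-zero a present one
                }
            }
            boolean firstPresent = pma[row].length > 0 && pma[row][0] != 0;
            if (firstPresent && missing <= maxMissing) {
                keep.add(row);
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        int[][] pma = {
            {1, 1, 1, 1}, // complete row
            {0, 1, 1, 1}, // first time point missing
            {1, 0, 0, 1}, // two missing values
        };
        System.out.println(rowsToKeep(pma, 1)); // prints [0]
    }
}
```

Returning row indices rather than a new matrix keeps the sketch small; the indices could be turned into a boolean mask of the kind filtergenesgeneral accepts.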
public DataSetCore filterDuplicates()
public DataSetCore averageAndFilterDuplicates()
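averageAndFilterDuplicates is described as combining duplicate gene rows using the median. As an illustration only, the sketch below groups rows by gene name and takes the per-column median. DuplicateMedianSketch and its methods are hypothetical names, and the sketch assumes every value is present; how the library treats missing values when combining rows is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class DuplicateMedianSketch {
    /** Median of a non-empty array; the input is left unmodified. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    /** Collapses rows that share a gene name into one row of per-column medians. */
    static Map<String, double[]> collapse(String[] names, double[][] data) {
        // LinkedHashMap preserves the order in which gene names first appear.
        Map<String, List<double[]>> groups = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            groups.computeIfAbsent(names[i], k -> new ArrayList<>()).add(data[i]);
        }
        Map<String, double[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<double[]>> entry : groups.entrySet()) {
            List<double[]> rows = entry.getValue();
            int cols = rows.get(0).length;
            double[] merged = new double[cols];
            for (int c = 0; c < cols; c++) {
                double[] column = new double[rows.size()];
                for (int r = 0; r < rows.size(); r++) {
                    column[r] = rows.get(r)[c];
                }
                merged[c] = median(column);
            }
            result.put(entry.getKey(), merged);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"g1", "g2", "g1", "g1"};
        double[][] data = {{1, 10}, {5, 5}, {3, 30}, {2, 20}};
        collapse(names, data).forEach(
                (gene, row) -> System.out.println(gene + " " + Arrays.toString(row)));
    }
}
```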
public DataSetCore logratio2()
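logratio2 is described as converting the data into log-ratio form versus the first time point. A rough sketch of that transformation follows. The base of the logarithm is not stated in this page, so base 2 is an assumption here, as are the names LogRatioSketch and logRatioVsFirst; missing values and non-positive inputs are not handled.

```java
import java.util.Arrays;

// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class LogRatioSketch {
    /**
     * Returns a new matrix where each entry is log2(data[g][t] / data[g][0]).
     * Assumption: base-2 logarithm and strictly positive input values.
     */
    static double[][] logRatioVsFirst(double[][] data) {
        double[][] out = new double[data.length][];
        for (int g = 0; g < data.length; g++) {
            out[g] = new double[data[g].length];
            for (int t = 0; t < data[g].length; t++) {
                // Change of base: log2(x) = ln(x) / ln(2).
                out[g][t] = Math.log(data[g][t] / data[g][0]) / Math.log(2.0);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = {{4.0, 8.0, 2.0}};
        // The first column is always 0 because each row is divided by its own first value.
        System.out.println(Arrays.toString(logRatioVsFirst(data)[0]));
    }
}
```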
public DataSetCore mergeDataSets(DataSetCore[] otherDataSets)
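mergeDataSets is described as storing in data the median of the values across data sets. The sketch below shows one way a cell-wise median over repeats could look. It assumes all matrices have the same shape with genes in the same row order, skips cells flagged as missing (0), and writes NaN when no repeat has a value; those choices and the names MergeMedianSketch and merge are assumptions for illustration, not the library's behavior.

```java
import java.util.Arrays;

// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class MergeMedianSketch {
    /**
     * Cell-wise median over same-shaped matrices. sets[s][g][t] is the value of
     * gene g at time t in data set s; pma[s][g][t] == 0 marks it as missing.
     */
    static double[][] merge(double[][][] sets, int[][][] pma) {
        int rows = sets[0].length;
        int cols = sets[0][0].length;
        double[][] merged = new double[rows][cols];
        for (int g = 0; g < rows; g++) {
            for (int t = 0; t < cols; t++) {
                double[] present = new double[sets.length];
                int n = 0;
                for (int s = 0; s < sets.length; s++) {
                    if (pma[s][g][t] != 0) {
                        present[n++] = sets[s][g][t];
                    }
                }
                if (n == 0) {
                    merged[g][t] = Double.NaN; // assumption: no value available
                    continue;
                }
                Arrays.sort(present, 0, n); // sort only the filled prefix
                merged[g][t] = n % 2 == 1
                        ? present[n / 2]
                        : (present[n / 2 - 1] + present[n / 2]) / 2.0;
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        double[][][] sets = {{{1, 2}}, {{3, 0}}, {{2, 9}}};
        int[][][] pma = {{{1, 1}}, {{1, 0}}, {{1, 1}}};
        System.out.println(Arrays.toString(merge(sets, pma)[0])); // prints [2.0, 5.5]
    }
}
```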
public DataSetCore filterdistprofiles(DataSetCore theDataSet1, DataSetCore[] RepeatSet)
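filterdistprofiles is described as computing the average pairwise correlation between gene repeats, with dmincorrelation as the minimum a gene must reach. The page does not say which correlation measure is used, so the sketch below assumes Pearson correlation and averages it over all unordered pairs of repeat profiles for one gene. RepeatCorrelationSketch and its method names are hypothetical.

```java
// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class RepeatCorrelationSketch {
    /** Pearson correlation of two equal-length profiles (NaN if either is constant). */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    /** Mean correlation over all unordered pairs of repeats; needs at least two repeats. */
    static double averagePairwise(double[][] repeats) {
        double sum = 0.0;
        int pairs = 0;
        for (int a = 0; a < repeats.length; a++) {
            for (int b = a + 1; b < repeats.length; b++) {
                sum += pearson(repeats[a], repeats[b]);
                pairs++;
            }
        }
        return sum / pairs;
    }

    public static void main(String[] args) {
        // Three repeat profiles of one gene over three time points.
        double[][] repeats = {{0, 1, 2}, {0, 2, 4}, {2, 1, 0}};
        double minCorrelation = 0.5; // stands in for dmincorrelation
        double avg = averagePairwise(repeats);
        System.out.println("keep gene: " + (avg >= minCorrelation));
    }
}
```

Collecting this average for every gene and sorting the results would give a distribution comparable to what sortedcorrvals is described as holding.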
public DataSetCore filtergenesgeneral(boolean[] keepgene, int nkeep, boolean bstore)
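filtergenesgeneral takes a boolean mask (keepgene) and the number of true entries (nkeep) and returns a new object containing only the kept rows. The mask-based copy can be sketched as below on a plain matrix. MaskFilterSketch and keepRows are hypothetical names, the bstore bookkeeping into htFiltered is not modeled, and the caller is assumed to pass an nkeep that matches the mask.

```java
import java.util.Arrays;

// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class MaskFilterSketch {
    /**
     * Copies the rows whose mask entry is true into a new matrix of nkeep rows.
     * Assumption: nkeep equals the number of true entries in keepgene.
     */
    static double[][] keepRows(double[][] data, boolean[] keepgene, int nkeep) {
        double[][] kept = new double[nkeep][];
        int next = 0;
        for (int i = 0; i < data.length; i++) {
            if (keepgene[i]) {
                kept[next++] = data[i].clone(); // copy so the input stays untouched
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2}, {3, 4}, {5, 6}};
        boolean[] keepgene = {true, false, true};
        System.out.println(Arrays.deepToString(keepRows(data, keepgene, 2)));
        // prints [[1.0, 2.0], [5.0, 6.0]]
    }
}
```

Passing nkeep alongside the mask lets the result array be allocated once without a separate counting pass.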
public DataSetCore filtergenesthreshold2()
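filtergenesthreshold2 switches between two change criteria depending on bmaxminval. The sketch below illustrates the two criteria on a single log-ratio profile. It interprets "absolute expression change" as the largest absolute value in the profile, which is an assumption, and uses a single threshold parameter for both modes; ChangeThresholdSketch and passes are hypothetical names.

```java
// Hypothetical helper, not part of edu.cmu.cs.sb.core.
final class ChangeThresholdSketch {
    /**
     * Returns true if the profile changes enough to be kept.
     * useMaxMin == true:  change is max - min over the profile.
     * useMaxMin == false: change is the largest absolute value (assumed reading
     *                     of "absolute expression change").
     */
    static boolean passes(double[] profile, double threshold, boolean useMaxMin) {
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        double maxAbs = 0.0;
        for (double v : profile) {
            max = Math.max(max, v);
            min = Math.min(min, v);
            maxAbs = Math.max(maxAbs, Math.abs(v));
        }
        double change = useMaxMin ? max - min : maxAbs;
        return change >= threshold; // genes below the threshold are filtered
    }

    public static void main(String[] args) {
        double[] profile = {-1.0, 0.0, 0.8};
        System.out.println(passes(profile, 1.5, true));  // prints true  (1.8 >= 1.5)
        System.out.println(passes(profile, 1.5, false)); // prints false (1.0 < 1.5)
    }
}
```

The example shows why the choice matters: a gene that moves both up and down relative to time zero can pass the max-min test while failing the absolute test.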
public DataSetCore filtergenesthreshold1point()