org.fhcrc.cpl.viewer.amt
Class AmtDatabaseManager

java.lang.Object
  extended by org.fhcrc.cpl.viewer.amt.AmtDatabaseManager

public class AmtDatabaseManager
extends java.lang.Object

Manages AMT databases -- prunes outliers of various kinds


Field Summary
static int DEFAULT_MATCHING_DEGREE_FOR_DB_ALIGNMENT
           
static int DEFAULT_MIN_OBSERVATIONS_FOR_ALIGNMENT_REGRESSION
           
static int DEFAULT_MIN_PEPTIDES_FOR_ALIGNMENT_REGRESSION
           
 
Constructor Summary
AmtDatabaseManager()
           
 
Method Summary
protected static void addRun(AmtDatabase newDatabase, AmtDatabase sourceDatabase, AmtRunEntry runToAdd)
           
static AmtDatabase adjustEntriesForAcrylamide(AmtDatabase amtDB, boolean fromAcrylamideToNot, boolean showCharts)
          Adjust AMT database entries to account for the effect of acrylamide
static AmtDatabase alignAllRunsUsingCommonPeptides(AmtDatabase amtDB, int minMatchedPeptides, int matchingdegree, boolean showCharts)
           
protected static double[] alignRun(AmtDatabase newDatabase, AmtDatabase sourceDatabase, AmtRunEntry runToAdd, java.util.Set<java.lang.String> peptideOverlap, int matchingDegree, boolean showCharts)
           
protected static java.util.Set<java.lang.String> createPeptideSetFromFeatures(Feature[] features)
           
static float[][] findMinAndMaxMassesForMatch(Feature[] features, float massMatchDeltaMass, int massMatchDeltaMassType)
          This is a utility method to determine minimum and maximum masses for matching individual features by mass.
protected static int findSetWithMostOverlap(java.util.Set<java.lang.String> baseSet, java.util.Set<java.lang.String>[] candidateSets, java.util.List<java.lang.Integer> setsToIgnore)
          Given a set of Strings and an array of candidate sets, find the candidate set with the most overlap with the base.
protected static java.util.Set<java.lang.String> intersection(java.util.Set<java.lang.String> overlapSet1, java.util.Set<java.lang.String> overlapSet2)
           
protected static AmtPeptideEntry.AmtPeptideObservation[] pickObservationsWithLowLeverage(AmtPeptideEntry.AmtPeptideObservation[] observations)
          Pick observations with a low leverage value.
static AmtPeptideEntry.AmtPeptideObservation[] removeHydrophobicityOutliers(AmtDatabase amtDB, double hydroStdDevMultipleCutoff)
          Throw out observations that are above a given multiple of the standard deviation of all observations for that peptide.
static void removePredictedHOutliers(AmtDatabase amtDB, float significantHydroDifference)
          Remove entries with only one observation, where that observation differs from predicted H by more than significantHydroDifference
static AmtDatabase removeRunsByStructure(AmtDatabase amtDatabase, int ms1RunCol, int ms1RunRow, int ms1RunExp, int maxNewDBEntries, int maxNewDBRuns, Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure, boolean showCharts)
           
static AmtDatabase removeRunsInOrder(AmtDatabase amtDatabase, int maxNewDBEntries, int maxNewDBRuns, Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure, AmtRunEntry[] runEntriesSorted, boolean showCharts, Feature[] ms1Features, MS2Modification[] ms2ModificationsForMatching, Feature[] ms2Features, AmtDatabaseMatcher databaseMatcher)
          Trims down an AMT database by adding runs one by one from runEntriesSorted, stopping when adding another run would violate maxNewDBEntries or maxNewDBRuns.
static AmtDatabase removeRunsWithoutMassMatches(AmtDatabase amtDatabase, Feature[] ms1FeaturesOrig, int minMassMatchPercent, float massMatchDeltaMass, int massMatchDeltaMassType, int maxNewDBEntries, int maxNewDBRuns, MS2Modification[] modificationsInMS1, boolean showCharts)
          Return a database containing only entries from the subset of runs from the passed-in AMT database that match, by mass only, at least minMassMatchPercent of their entries to ms1Features.
static AmtDatabase removeRunsWithoutPeptideMatches(AmtDatabase amtDatabase, Feature[] ms2Features, int minPeptideMatchPercent, int maxNewDBEntries, int maxNewDBRuns, Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure, boolean showCharts, Feature[] ms1Features, MS2Modification[] ms2ModificationsForMatching, AmtDatabaseMatcher dbMatcher)
          Remove runs with insufficient peptide matches to the MS/MS features passed in
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_MIN_PEPTIDES_FOR_ALIGNMENT_REGRESSION

public static final int DEFAULT_MIN_PEPTIDES_FOR_ALIGNMENT_REGRESSION
See Also:
Constant Field Values

DEFAULT_MIN_OBSERVATIONS_FOR_ALIGNMENT_REGRESSION

public static final int DEFAULT_MIN_OBSERVATIONS_FOR_ALIGNMENT_REGRESSION
See Also:
Constant Field Values

DEFAULT_MATCHING_DEGREE_FOR_DB_ALIGNMENT

public static final int DEFAULT_MATCHING_DEGREE_FOR_DB_ALIGNMENT
See Also:
Constant Field Values
Constructor Detail

AmtDatabaseManager

public AmtDatabaseManager()
Method Detail

removePredictedHOutliers

public static void removePredictedHOutliers(AmtDatabase amtDB,
                                            float significantHydroDifference)
Remove entries with only one observation, where that observation differs from predicted H by more than significantHydroDifference

Parameters:
amtDB -
significantHydroDifference -
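
The rule described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the Entry type, its fields, and the method name dropSingleObservationOutliers are hypothetical stand-ins for the real AmtDatabase types, and the sketch returns a new list where the documented method returns void.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PredictedHOutlierSketch {
    /** Hypothetical stand-in for a database entry: a peptide, its predicted H, its observed H values. */
    static final class Entry {
        final String peptide;
        final double predictedH;
        final double[] observedH;

        Entry(String peptide, double predictedH, double... observedH) {
            this.peptide = peptide;
            this.predictedH = predictedH;
            this.observedH = observedH;
        }
    }

    /** Keeps every entry except those with exactly one observation that is far from the prediction. */
    static List<Entry> dropSingleObservationOutliers(List<Entry> entries, double significantHydroDifference) {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            boolean singleObservation = e.observedH.length == 1;
            boolean farFromPrediction = singleObservation
                    && Math.abs(e.observedH[0] - e.predictedH) > significantHydroDifference;
            if (!farFromPrediction) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("PEPA", 20.0, 21.0),        // single observation, close to prediction
                new Entry("PEPB", 20.0, 35.0),        // single observation, far from prediction
                new Entry("PEPC", 20.0, 35.0, 36.0)); // several observations, never removed by this rule
        for (Entry e : dropSingleObservationOutliers(entries, 5.0)) {
            System.out.println(e.peptide);
        }
    }
}
```

Entries with two or more observations pass through untouched in this sketch, since the description restricts the rule to entries with only one observation.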

adjustEntriesForAcrylamide

public static AmtDatabase adjustEntriesForAcrylamide(AmtDatabase amtDB,
                                                     boolean fromAcrylamideToNot,
                                                     boolean showCharts)
Adjust AMT database entries to account for the effect of acrylamide

Parameters:
amtDB - Altered in place
fromAcrylamideToNot - If true, adjust acrylamide-containing entries to remove acrylamide's effect. If false, adjust non-acrylamide-containing entries to add the effect of acrylamide
Returns:

intersection

protected static java.util.Set<java.lang.String> intersection(java.util.Set<java.lang.String> overlapSet1,
                                                              java.util.Set<java.lang.String> overlapSet2)

alignAllRunsUsingCommonPeptides

public static AmtDatabase alignAllRunsUsingCommonPeptides(AmtDatabase amtDB,
                                                          int minMatchedPeptides,
                                                          int matchingdegree,
                                                          boolean showCharts)
Parameters:
amtDB -
minMatchedPeptides -

alignRun

protected static double[] alignRun(AmtDatabase newDatabase,
                                   AmtDatabase sourceDatabase,
                                   AmtRunEntry runToAdd,
                                   java.util.Set<java.lang.String> peptideOverlap,
                                   int matchingDegree,
                                   boolean showCharts)

addRun

protected static void addRun(AmtDatabase newDatabase,
                             AmtDatabase sourceDatabase,
                             AmtRunEntry runToAdd)

pickObservationsWithLowLeverage

protected static AmtPeptideEntry.AmtPeptideObservation[] pickObservationsWithLowLeverage(AmtPeptideEntry.AmtPeptideObservation[] observations)
Pick observations with a low leverage value. Because there is a linear map from time to hydrophobicity, it does not matter whether the leverage is based on time or on hydrophobicity. This method uses hydrophobicity.

Parameters:
observations -
Returns:
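
As an illustration of the leverage idea, the sketch below computes the standard leverage of each point in a simple linear regression, h_i = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2, and keeps the points at or below a cutoff. The class name, the method names, and the maxLeverage parameter are assumptions made for this example; the documentation does not say which cutoff the real method applies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LeverageSketch {
    /** Leverage of each x value in a simple linear regression (intercept plus slope). */
    static double[] leverages(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;
        double sxx = 0.0;
        for (double v : x) {
            sxx += (v - mean) * (v - mean);
        }
        double[] h = new double[n];
        for (int i = 0; i < n; i++) {
            double d = x[i] - mean;
            // If all x values are identical the spread term is undefined; fall back to 1/n.
            h[i] = 1.0 / n + (sxx == 0.0 ? 0.0 : d * d / sxx);
        }
        return h;
    }

    /** Indices whose leverage does not exceed maxLeverage (a hypothetical cutoff). */
    static List<Integer> lowLeverageIndices(double[] x, double maxLeverage) {
        double[] h = leverages(x);
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < h.length; i++) {
            if (h[i] <= maxLeverage) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] hydrophobicity = {1, 2, 3, 4, 10};
        System.out.println(Arrays.toString(leverages(hydrophobicity)));
        System.out.println(lowLeverageIndices(hydrophobicity, 0.8));
    }
}
```

Leverage depends only on each point's squared distance from the mean relative to the total squared spread, so replacing x with a*x + b (a nonzero) leaves every value unchanged. That is why basing it on time or on hydrophobicity gives the same result.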

findSetWithMostOverlap

protected static int findSetWithMostOverlap(java.util.Set<java.lang.String> baseSet,
                                            java.util.Set<java.lang.String>[] candidateSets,
                                            java.util.List<java.lang.Integer> setsToIgnore)
Given a set of Strings and an array of candidate sets, find the candidate set with the most overlap with the base. Ignore any sets in setsToIgnore. This would be trivial if HashSets actually implemented the set operations that they should.

Parameters:
baseSet -
candidateSets -
setsToIgnore -
Returns:
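
A minimal sketch of the overlap search, together with a set intersection helper, is shown below. It is written under assumptions: the candidates are passed as a List rather than the documented array (to avoid generic array creation in the example), ties go to the earliest candidate, and -1 is returned when every candidate is ignored. None of these choices is confirmed by the documentation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OverlapSketch {
    /** Returns a new set holding the elements common to both inputs; the inputs are not modified. */
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    /**
     * Returns the index of the candidate sharing the most elements with baseSet,
     * skipping indices listed in setsToIgnore. Returns -1 if nothing is eligible.
     */
    static int findSetWithMostOverlap(Set<String> baseSet,
                                      List<Set<String>> candidateSets,
                                      List<Integer> setsToIgnore) {
        int bestIndex = -1;
        int bestOverlap = -1;
        for (int i = 0; i < candidateSets.size(); i++) {
            if (setsToIgnore.contains(i)) {
                continue;
            }
            int overlap = intersection(baseSet, candidateSets.get(i)).size();
            if (overlap > bestOverlap) { // strict comparison: the first of several tied sets wins
                bestOverlap = overlap;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        Set<String> base = new HashSet<>(Arrays.asList("A", "B", "C"));
        List<Set<String>> candidates = Arrays.asList(
                new HashSet<>(Arrays.asList("A")),
                new HashSet<>(Arrays.asList("A", "B", "X")),
                new HashSet<>(Arrays.asList("A", "B", "C")));
        System.out.println(findSetWithMostOverlap(base, candidates, Arrays.asList(2)));
    }
}
```

The intersection helper copies the first set before calling retainAll, because retainAll modifies the set it is called on.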

removeHydrophobicityOutliers

public static AmtPeptideEntry.AmtPeptideObservation[] removeHydrophobicityOutliers(AmtDatabase amtDB,
                                                                                   double hydroStdDevMultipleCutoff)
Throw out observations that are above a given multiple of the standard deviation of all observations for that peptide. Note that for low degrees of freedom (few peptide observations), the standard deviation becomes unreliable, and comparing against a t value would be more appropriate. According to Yan, however, a cutoff of 3 standard deviations is still safe: for 3 peptide observations, it is equivalent to discarding <5% of observations, and it becomes more conservative as the degrees of freedom increase. Ideally, the parameter would be percentToThrowAway rather than hydroStdDevMultipleCutoff, but that is complicated, and this is safe enough.

Parameters:
amtDB -
hydroStdDevMultipleCutoff -
Returns:
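
The cutoff can be illustrated with the sketch below, which works on a plain array of hydrophobicity values for one peptide. It assumes that "above a given multiple" means the absolute distance from the peptide's mean, and that the sample standard deviation (n - 1 denominator) is used; neither detail is stated in the documentation, and the class and method names are hypothetical.

```java
import java.util.Arrays;

final class StdDevCutoffSketch {
    /**
     * Returns the values whose distance from the mean is at most
     * stdDevMultipleCutoff sample standard deviations.
     */
    static double[] keepWithinCutoff(double[] values, double stdDevMultipleCutoff) {
        int n = values.length;
        if (n < 2) {
            return values.clone(); // standard deviation is undefined for fewer than two values
        }
        double mean = Arrays.stream(values).average().orElse(0.0);
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(sumSquares / (n - 1));
        double limit = stdDevMultipleCutoff * stdDev;
        return Arrays.stream(values)
                .filter(v -> Math.abs(v - mean) <= limit)
                .toArray();
    }

    public static void main(String[] args) {
        double[] observed = {10, 10.1, 9.9, 10, 10.05, 9.95, 10, 30};
        System.out.println(Arrays.toString(keepWithinCutoff(observed, 2.0)));
    }
}
```

Under these particular assumptions, where the mean and sample standard deviation are computed from the same n values being tested, no value can lie more than (n - 1) / sqrt(n) standard deviations from the mean. A cutoff of 3 therefore cannot remove anything until a peptide has at least 11 observations, which is one way to see why the cutoff is conservative for small counts.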

findMinAndMaxMassesForMatch

public static float[][] findMinAndMaxMassesForMatch(Feature[] features,
                                                    float massMatchDeltaMass,
                                                    int massMatchDeltaMassType)
This is a utility method to determine minimum and maximum masses for matching individual features by mass. Returns min and then max masses, in an array.

Parameters:
features -
massMatchDeltaMass -
massMatchDeltaMassType -
Returns:
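
A sketch of how such a min/max window might be computed is given below. It takes plain masses instead of Feature objects, and the two delta-mass-type constants are hypothetical: the documentation does not list the values accepted by massMatchDeltaMassType, so absolute (Daltons) and relative (parts per million) are assumed here as the two common options.

```java
import java.util.Arrays;

final class MassWindowSketch {
    // Hypothetical constants; the real values of massMatchDeltaMassType are not documented here.
    static final int DELTA_MASS_TYPE_ABSOLUTE = 0;
    static final int DELTA_MASS_TYPE_PPM = 1;

    /**
     * Returns a two-row array: row 0 holds the minimum matching mass for each input mass,
     * row 1 holds the maximum.
     */
    static float[][] findMinAndMaxMasses(float[] masses, float deltaMass, int deltaMassType) {
        float[] min = new float[masses.length];
        float[] max = new float[masses.length];
        for (int i = 0; i < masses.length; i++) {
            // A ppm tolerance scales with the mass; an absolute tolerance is used as given.
            float tolerance = deltaMassType == DELTA_MASS_TYPE_PPM
                    ? masses[i] * deltaMass / 1_000_000f
                    : deltaMass;
            min[i] = masses[i] - tolerance;
            max[i] = masses[i] + tolerance;
        }
        return new float[][] {min, max};
    }

    public static void main(String[] args) {
        float[] masses = {1000f, 2000f};
        float[][] window = findMinAndMaxMasses(masses, 10f, DELTA_MASS_TYPE_PPM);
        System.out.println(Arrays.toString(window[0]));
        System.out.println(Arrays.toString(window[1]));
    }
}
```

Computing both bounds once per feature means a later matching loop only needs two comparisons per candidate mass.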

removeRunsWithoutMassMatches

public static AmtDatabase removeRunsWithoutMassMatches(AmtDatabase amtDatabase,
                                                       Feature[] ms1FeaturesOrig,
                                                       int minMassMatchPercent,
                                                       float massMatchDeltaMass,
                                                       int massMatchDeltaMassType,
                                                       int maxNewDBEntries,
                                                       int maxNewDBRuns,
                                                       MS2Modification[] modificationsInMS1,
                                                       boolean showCharts)
Return a database containing only entries from the subset of runs from the passed-in AMT database that match, by mass only, at least minMassMatchPercent of their entries to ms1Features. The best runs are added first, and adding stops if the database gets too big.

Parameters:
amtDatabase -
ms1FeaturesOrig -
minMassMatchPercent -
massMatchDeltaMass -
massMatchDeltaMassType -
maxNewDBEntries -
modificationsInMS1 -
showCharts -
Returns:
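
The per-run score that this method filters on can be sketched as the percentage of a run's entry masses that fall within a tolerance of any MS1 feature mass. The example below is an assumption-laden illustration: it uses plain float arrays, an absolute tolerance only, and hypothetical class and method names. It shows only the scoring step, not the run selection.

```java
import java.util.Arrays;

final class MassMatchPercentSketch {
    /** True if any mass in the sorted array lies within tolerance of the given mass. */
    static boolean hasMatch(float[] sortedFeatureMasses, float mass, float tolerance) {
        int index = Arrays.binarySearch(sortedFeatureMasses, mass);
        if (index >= 0) {
            return true; // exact hit
        }
        int insertionPoint = -index - 1;
        // Only the two neighbours of the insertion point can be the closest masses.
        if (insertionPoint < sortedFeatureMasses.length
                && sortedFeatureMasses[insertionPoint] - mass <= tolerance) {
            return true;
        }
        return insertionPoint > 0
                && mass - sortedFeatureMasses[insertionPoint - 1] <= tolerance;
    }

    /** Percentage (0 to 100) of run entry masses that match at least one feature mass. */
    static double percentMatched(float[] runEntryMasses, float[] featureMasses, float tolerance) {
        if (runEntryMasses.length == 0) {
            return 0.0;
        }
        float[] sorted = featureMasses.clone();
        Arrays.sort(sorted);
        int matched = 0;
        for (float mass : runEntryMasses) {
            if (hasMatch(sorted, mass, tolerance)) {
                matched++;
            }
        }
        return 100.0 * matched / runEntryMasses.length;
    }

    public static void main(String[] args) {
        float[] features = {1500f, 500f, 1000f};
        float[] runEntries = {500.2f, 999.9f, 1200f, 1500.4f};
        System.out.println(percentMatched(runEntries, features, 0.25f));
    }
}
```

A run would then be kept when its score is at least minMassMatchPercent. Sorting the feature masses once lets each lookup use binary search instead of a full scan.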

removeRunsByStructure

public static AmtDatabase removeRunsByStructure(AmtDatabase amtDatabase,
                                                int ms1RunCol,
                                                int ms1RunRow,
                                                int ms1RunExp,
                                                int maxNewDBEntries,
                                                int maxNewDBRuns,
                                                Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure,
                                                boolean showCharts)

createPeptideSetFromFeatures

protected static java.util.Set<java.lang.String> createPeptideSetFromFeatures(Feature[] features)

removeRunsWithoutPeptideMatches

public static AmtDatabase removeRunsWithoutPeptideMatches(AmtDatabase amtDatabase,
                                                          Feature[] ms2Features,
                                                          int minPeptideMatchPercent,
                                                          int maxNewDBEntries,
                                                          int maxNewDBRuns,
                                                          Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure,
                                                          boolean showCharts,
                                                          Feature[] ms1Features,
                                                          MS2Modification[] ms2ModificationsForMatching,
                                                          AmtDatabaseMatcher dbMatcher)
Remove runs with insufficient peptide matches to the MS/MS features passed in

Parameters:
amtDatabase -
ms2Features -
minPeptideMatchPercent -
maxNewDBEntries -
maxNewDBRuns -
showCharts -
Returns:

removeRunsInOrder

public static AmtDatabase removeRunsInOrder(AmtDatabase amtDatabase,
                                            int maxNewDBEntries,
                                            int maxNewDBRuns,
                                            Fractionation2DUtilities.FractionatedAMTDatabaseStructure amtDatabaseStructure,
                                            AmtRunEntry[] runEntriesSorted,
                                            boolean showCharts,
                                            Feature[] ms1Features,
                                            MS2Modification[] ms2ModificationsForMatching,
                                            Feature[] ms2Features,
                                            AmtDatabaseMatcher databaseMatcher)
Trims down an AMT database by adding runs one by one from runEntriesSorted, stopping when adding another run would violate maxNewDBEntries or maxNewDBRuns.

Parameters:
amtDatabase -
maxNewDBEntries -
maxNewDBRuns -
amtDatabaseStructure -
runEntriesSorted - sorted in _descending_ order of goodness
showCharts -
Returns:
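
The greedy trimming described above can be sketched as follows. The sketch models each run as a set of peptide strings and counts a database "entry" as a distinct peptide; both are assumptions made for the example, as are the class and method names. Runs are expected best-first, matching the descending order documented for runEntriesSorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GreedyRunSelectionSketch {
    /**
     * Walks the runs in the given order and returns the indices of the runs kept.
     * Stops at the first run whose addition would exceed maxRuns or maxEntries.
     */
    static List<Integer> selectRuns(List<Set<String>> runsBestFirst, int maxEntries, int maxRuns) {
        List<Integer> keptRuns = new ArrayList<>();
        Set<String> entries = new HashSet<>();
        for (int i = 0; i < runsBestFirst.size(); i++) {
            if (keptRuns.size() + 1 > maxRuns) {
                break;
            }
            // Build the would-be entry set first so a rejected run leaves the result untouched.
            Set<String> candidate = new HashSet<>(entries);
            candidate.addAll(runsBestFirst.get(i));
            if (candidate.size() > maxEntries) {
                break;
            }
            entries = candidate;
            keptRuns.add(i);
        }
        return keptRuns;
    }

    public static void main(String[] args) {
        List<Set<String>> runs = Arrays.asList(
                new HashSet<>(Arrays.asList("A", "B")),
                new HashSet<>(Arrays.asList("B", "C")),
                new HashSet<>(Arrays.asList("D", "E", "F")),
                new HashSet<>(Arrays.asList("A")));
        System.out.println(selectRuns(runs, 4, 10));
    }
}
```

This sketch stops at the first violating run and does not try later, smaller runs, following the "stopping when adding another run would violate" wording.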


Fred Hutchinson Cancer Research Center