SDRF to MaxQuant analysis

Example for re-analysing sdrf-annotated data with MaxQuant

Here, we show how to re-analyse the proteomic standard data set (publication) using the annotations in the SDRF file. The procedure can easily be adapted to other data sets.

We used the following versions: sdrf-pipelines (0.0.14) and [MaxQuant](https://www.maxquant.org/) (1.6.10.43). We recommend using Conda for the installation.

Data download

You need to download the SDRF file, a database that contains the yeast proteome and the UPS proteins (e.g. this one), and the raw data files from PRIDE.
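Before continuing, it can help to confirm that every raw file named in the SDRF file was actually downloaded. The following is a minimal sketch, not part of sdrf-pipelines: the helper name `missing_raw_files` is hypothetical, and it assumes the raw file names are stored in the SDRF column `comment[data file]`.

```python
import csv
from pathlib import Path


def missing_raw_files(sdrf_path, raw_dir, column="comment[data file]"):
    """Return the raw file names listed in the SDRF file but absent from raw_dir.

    Assumption: the file names are in the column given by `column`.
    """
    with open(sdrf_path, newline="", encoding="utf-8") as handle:
        # csv.reader instead of DictReader: SDRF files may repeat column names.
        reader = csv.reader(handle, delimiter="\t")
        header = [name.strip().lower() for name in next(reader)]
        idx = header.index(column)  # raises ValueError if the column is absent
        wanted = {
            row[idx].strip()
            for row in reader
            if len(row) > idx and row[idx].strip()
        }
    present = {path.name for path in Path(raw_dir).iterdir()}
    return sorted(wanted - present)


# Example call (paths as used in this tutorial):
# print(missing_raw_files("sdrf.tsv", "PATH_TO_RAW_FILES"))
```

An empty list means that all files named in the SDRF file are present in the folder.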

Create MaxQuant parameter file

The following command writes the experimental design, the file paths and the search parameters available in the SDRF file into a MaxQuant parameter file. All other settings keep their default values.

parse_sdrf convert-maxquant -s sdrf.tsv -f $PWD/yeast_UPS.fasta -r PATH_TO_RAW_FILES

Here, we assume that the files sdrf.tsv and yeast_UPS.fasta are located in the current folder. Do not forget to change PATH_TO_RAW_FILES accordingly.

Important: Always use absolute paths for the fasta file and the folder with the raw files, as MaxQuant can have issues with relative paths. You might need to replace the $PWD variable if you are in a Windows or a Mac environment. You will get a MaxQuant parameter file named mqpar.xml.
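Where $PWD is not available, the absolute paths can also be built in Python. This is a minimal sketch that only assembles the command shown above with the same three options (-s, -f, -r). The helper name `build_convert_command` is hypothetical and not part of sdrf-pipelines.

```python
import shlex
from pathlib import Path


def build_convert_command(sdrf, fasta, raw_dir):
    """Return the parse_sdrf call as an argument list with absolute paths."""
    return [
        "parse_sdrf",
        "convert-maxquant",
        "-s", str(sdrf),
        # Resolve the fasta file and the raw-file folder to absolute paths,
        # because MaxQuant can have issues with relative ones.
        "-f", str(Path(fasta).resolve()),
        "-r", str(Path(raw_dir).resolve()),
    ]


cmd = build_convert_command("sdrf.tsv", "yeast_UPS.fasta", "PATH_TO_RAW_FILES")
print(shlex.join(cmd))  # shows the command line without running it
# To run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.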

The resulting mqpar.xml starts with the following lines:

<?xml version="1.0" encoding="utf-8"?>
<MaxQuantParams xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<fastaFiles>
		<FastaFileInfo>
			<fastaFilePath>/home/veit/Test_sdrf_MQ/yeast_UPS.fasta</fastaFilePath>
			<identifierParseRule>&gt;([^\s]*)</identifierParseRule>
			<descriptionParseRule>&gt;(.*)</descriptionParseRule>
			<taxonomyParseRule></taxonomyParseRule>
			<variationParseRule></variationParseRule>
			<modificationParseRule></modificationParseRule>
			<taxonomyId></taxonomyId>
		</FastaFileInfo>
	</fastaFiles>
	<fastaFilesProteogenomics></fastaFilesProteogenomics>
	<fastaFilesFirstSearch></fastaFilesFirstSearch>
	<fixedSearchFolder></fixedSearchFolder>
	<andromedaCacheSize>350000</andromedaCacheSize>
	<advancedRatios>True</advancedRatios>
	<pvalThres>0.005</pvalThres>
	<neucodeRatioBasedQuantification>False</neucodeRatioBasedQuantification>
	<neucodeStabilizeLargeRatios>False</neucodeStabilizeLargeRatios>
	<rtShift>False</rtShift>
	<separateLfq>False</separateLfq>
	<lfqStabilizeLargeRatios>True</lfqStabilizeLargeRatios>

The mqpar.xml for the UPS example can be found here.
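To verify that the generated parameter file really contains an absolute fasta path, the file can be inspected with Python's standard XML parser. This is a sketch under the assumption that the file follows the layout shown above (`fastaFiles/FastaFileInfo/fastaFilePath`). The helper names are hypothetical, and the embedded sample is a shortened stand-in for a real mqpar.xml.

```python
import io
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath, PureWindowsPath


def fasta_paths(source):
    """Return all fastaFilePath entries; source is a file name or file object."""
    root = ET.parse(source).getroot()
    elements = root.findall("fastaFiles/FastaFileInfo/fastaFilePath")
    return [element.text or "" for element in elements]


def is_absolute(path):
    """True if the path is absolute in either POSIX or Windows notation."""
    return PurePosixPath(path).is_absolute() or PureWindowsPath(path).is_absolute()


# Shortened sample in the layout of the excerpt above.
sample = b"""<MaxQuantParams>
  <fastaFiles>
    <FastaFileInfo>
      <fastaFilePath>/home/veit/Test_sdrf_MQ/yeast_UPS.fasta</fastaFilePath>
    </FastaFileInfo>
  </fastaFiles>
</MaxQuantParams>"""

for path in fasta_paths(io.BytesIO(sample)):
    print(path, "absolute" if is_absolute(path) else "RELATIVE")
```

For a real run, call `fasta_paths("mqpar.xml")` instead of using the sample.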

Note: Check the description of sdrf-pipelines for further options, such as setting the temporary folder or the number of threads to accelerate the MaxQuant analysis.

Run MaxQuant

The standard command-line procedure is:

maxquant mqpar.xml

Running the full UPS data set will take a while (hours to a day), depending on the computer. You will find the output files in a subfolder named combined inside the directory that contains the raw files.
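After the run, a quick way to see what MaxQuant produced is to list the files in that subfolder. The sketch below assumes only the folder name `combined` mentioned above. The helper name `list_outputs` is hypothetical.

```python
from pathlib import Path


def list_outputs(raw_dir, subfolder="combined"):
    """Return the files below raw_dir/subfolder as sorted relative paths."""
    out_dir = Path(raw_dir) / subfolder
    if not out_dir.is_dir():
        # No output folder yet, for example because the run has not finished.
        return []
    return sorted(
        path.relative_to(out_dir).as_posix()
        for path in out_dir.rglob("*")
        if path.is_file()
    )


# Example call (folder as used in this tutorial):
# for name in list_outputs("PATH_TO_RAW_FILES"):
#     print(name)
```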
