BLASTER - Modules

Module Pipeline MindMap for SIDER_RepetitiveSearcher project

1. **aesthetics.py**

1.1. **boxymcboxface**

1.1.1. Just some aesthetic display

2. **blaster.py**

2.1. **blastn_dic**

2.1.1. It creates a **BLAST database** with the **FASTA** file it reads

2.2. **blastn_blaster**

2.2.1. It executes a **BLASTn** search
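The two steps above (database creation, then search) can be sketched as thin wrappers around the BLAST+ command-line tools. This is a minimal sketch, not the project's actual code: the helper names, the `perc_identity` default, and the chosen output columns are assumptions made for illustration.

```python
import subprocess


def makeblastdb_command(fasta_path, db_path):
    """Build the makeblastdb call that turns a FASTA file into a nucleotide database."""
    return ["makeblastdb", "-in", str(fasta_path), "-dbtype", "nucl", "-out", str(db_path)]


def blastn_command(query_path, db_path, perc_identity=60):
    """Build a BLASTn call that prints comma-separated hits (output format 10)."""
    return [
        "blastn",
        "-query", str(query_path),
        "-db", str(db_path),
        "-perc_identity", str(perc_identity),  # assumed threshold, adjust as needed
        "-outfmt", "10 qseqid sseqid pident length sstart send sstrand sseq",
    ]


def run(command):
    """Run a command and return its standard output (requires BLAST+ on the PATH)."""
    return subprocess.run(command, check=True, capture_output=True, text=True).stdout
```

Building the argument list separately from running it keeps the commands easy to inspect or log before anything is executed.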

2.3. **repetitive_blaster**

3. **identifiers.py**

3.1. **specific_sequence_extractor**

3.1.1. For each **chromosome ID** selected, it will take all the rows in the **CSV** file corresponding to that **chromosome ID**

3.2. **genome_specific_chromosome_main**

3.2.1. Calls all blasters and filters

4. **seq_modifiers.py**

4.1. **specific_sequence_1000nt**

4.1.1. For each **chromosome ID**, it will expand all the selected sequences in the **CSV** to **1000 nt**
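The expansion to 1000 nt could look like the sketch below. It assumes 1-based inclusive coordinates with start <= end, symmetric padding on both sides, and clamping at the chromosome ends; the source does not state how the extra nucleotides are distributed, so the function name and these choices are hypothetical.

```python
def expand_to_length(start, end, chromosome_length, target=1000):
    """Widen a 1-based inclusive interval around itself until it spans `target` nt."""
    current = end - start + 1
    if current >= target:
        return start, end  # already long enough, leave untouched
    missing = target - current
    new_start = start - missing // 2
    new_end = end + (missing - missing // 2)
    # Shift the window back inside the chromosome if it ran past either end.
    if new_start < 1:
        new_end += 1 - new_start
        new_start = 1
    if new_end > chromosome_length:
        new_start -= new_end - chromosome_length
        new_end = chromosome_length
    # A chromosome shorter than `target` yields the whole chromosome.
    return max(new_start, 1), new_end
```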

4.2. **specific_sequence_corrected**

4.2.1. It reads the **CSV** file and finds the **correct coordinates** of the sequences.

5. **files_manager.py**

5.1. **folder_creator**

5.1.1. Creates a folder in the current path

5.2. **csv_mixer**

5.2.1. Mixes 2 CSV files into one
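Mixing two CSV files can be sketched with the standard `csv` module. The function name `mix_csv` is hypothetical, and the sketch assumes neither file has a header row, so rows are simply appended in order.

```python
import csv


def mix_csv(first_path, second_path, output_path):
    """Copy the rows of two CSV files into one output file; return the row count."""
    written = 0
    with open(output_path, "w", newline="") as out_handle:
        writer = csv.writer(out_handle)
        for path in (first_path, second_path):
            with open(path, newline="") as in_handle:
                for row in csv.reader(in_handle):
                    writer.writerow(row)
                    written += 1
    return written
```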

5.3. **csv_creator**

5.3.1. Creates a **CSV** file

5.4. **fasta_creator**

5.4.1. It creates a **FASTA** file from the input **CSV** file
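A CSV-to-FASTA conversion might be sketched as follows. The column holding the sequence (here the last one) and the `Seq_N` header pattern are assumptions for illustration, as is the function name.

```python
import csv


def csv_to_fasta(csv_path, fasta_path, sequence_column=-1, prefix="Seq"):
    """Write one FASTA record per CSV row and return how many records were written."""
    count = 0
    with open(csv_path, newline="") as csv_handle, open(fasta_path, "w") as fasta_handle:
        for row in csv.reader(csv_handle):
            if not row:
                continue  # skip blank lines
            count += 1
            # Header line, then the sequence taken from the assumed column.
            fasta_handle.write(f">{prefix}_{count}\n{row[sequence_column]}\n")
    return count
```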

6. **filters.py**

6.1. **chromosome_filter**

6.1.1. It reads a **FASTA** file and numbers all the sequences with a custom made **prefix**

6.1.2. The sequences in the **FASTA** file NEED to be in order
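Numbering by position explains why the file order matters: the n-th record always receives the n-th label. A minimal sketch, assuming a `prefix_N` naming pattern (the real pattern is not given in the source) and a hypothetical function name:

```python
def number_fasta_headers(lines, prefix):
    """Yield FASTA lines with each header replaced by '>{prefix}_{n}' in file order."""
    counter = 0
    for line in lines:
        if line.startswith(">"):
            counter += 1
            yield f">{prefix}_{counter}\n"
        else:
            yield line  # sequence lines pass through unchanged
```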

6.2. **dash_filter**

6.2.1. Filters out **dashes** ("-") in a **CSV** file
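One plausible reading is that alignment gaps ("-") are stripped from the sequence field of each row. The sketch below follows that reading; the function name and the choice of the last column as the sequence column are assumptions.

```python
def strip_dashes(rows, sequence_column=-1):
    """Return copies of the rows with '-' characters removed from the sequence field."""
    cleaned = []
    for row in rows:
        row = list(row)  # copy so the caller's data is not modified
        row[sequence_column] = row[sequence_column].replace("-", "")
        cleaned.append(row)
    return cleaned
```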

6.3. **filter_by_column**

6.3.1. Filter **CSV** data depending on:

6.3.1.1. Minimum sequence **length**

6.3.1.2. Minimum identity **percent**

6.3.2. Then it outputs a **CSV** file
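The two thresholds can be applied row by row while streaming from one CSV to another. In this sketch the function name is hypothetical and the column positions (identity in column 2, length in column 3) are assumptions about the file layout.

```python
import csv


def filter_csv(in_handle, out_handle, min_length, min_identity,
               identity_column=2, length_column=3):
    """Copy rows meeting both thresholds to out_handle; return how many were kept."""
    writer = csv.writer(out_handle)
    kept = 0
    for row in csv.reader(in_handle):
        long_enough = int(row[length_column]) >= min_length
        similar_enough = float(row[identity_column]) >= min_identity
        if long_enough and similar_enough:
            writer.writerow(row)
            kept += 1
    return kept
```

Working on open file handles rather than paths lets the same function be used with real files or with in-memory buffers such as `io.StringIO`.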

6.4. **global_filters_main**

6.4.1. Mixes other filters

6.4.2. The output is a **CSV** file that is constantly overwritten

7. **duplicates.py**

7.1. **genome_pre_duplicate_filter**

7.1.1. Filters the data in a **CSV** for duplicated sequences

7.2. **genome_duplicate_filter**

7.2.1. Calls *genome_pre_duplicate_filter* for the "+" and "-" DNA strands and merges the results.

7.2.2. The output is a **CSV** file with the filtered data.
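The per-strand pattern described above can be sketched as a generic de-duplication step applied to each strand and then concatenated. What counts as a duplicate (here: identical values in the chosen key columns), the strand labels, and both function names are assumptions.

```python
def drop_duplicates(rows, key_columns):
    """Keep the first row for every distinct combination of the key columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_duplicates_by_strand(rows, strand_column, key_columns, strands=("plus", "minus")):
    """De-duplicate each strand separately, then join the results."""
    result = []
    for strand in strands:
        strand_rows = [row for row in rows if row[strand_column] == strand]
        result.extend(drop_duplicates(strand_rows, key_columns))
    return result
```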

8. **overlap.py**

8.1. **genome_solap_location_filter**

8.1.1. For each DNA strand, it gets the **start** and **end** coordinates

8.2. **genome_solap_location_grouping**

8.2.1. For each DNA strand, it groups coordinates that lie near each other
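Grouping nearby coordinates can be sketched as a single pass over the sorted values, starting a new group whenever the gap to the previous value exceeds a threshold. The function name and the `max_gap` parameter are hypothetical; the source does not say how nearness is measured.

```python
def group_by_nearness(coordinates, max_gap):
    """Cluster sorted coordinates; a gap larger than max_gap starts a new group."""
    groups = []
    for value in sorted(coordinates):
        if groups and value - groups[-1][-1] <= max_gap:
            groups[-1].append(value)  # close enough to the previous value
        else:
            groups.append([value])
    return groups
```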

8.3. **genome_solap_minmax**

8.3.1. Depending on the DNA strand, it takes the *minimum* or *maximum* of each **group**
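A sketch of the strand-dependent choice, under the assumption that coordinates follow the BLAST convention in which start < end on the plus strand and start > end on the minus strand. The function name, the strand labels, and the `kind` parameter are hypothetical.

```python
def group_boundary(group, strand, kind):
    """Pick the outermost 'start' or 'end' coordinate of a group for a strand."""
    if kind not in ("start", "end"):
        raise ValueError("kind must be 'start' or 'end'")
    if strand == "plus":
        # Plus strand: the start is the smallest value, the end the largest.
        return min(group) if kind == "start" else max(group)
    # Minus strand: coordinates run backwards, so the roles are swapped.
    return max(group) if kind == "start" else min(group)
```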

8.4. **genome_solap_by_pairs**

8.4.1. It analyzes two overlapping sequences and joins them
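Joining a pair of overlapping sequences reduces to merging two intervals. The sketch below assumes each sequence is given as a `(start, end)` tuple with start <= end on the same chromosome and strand; the function name is hypothetical.

```python
def join_if_overlapping(first, second):
    """Merge two (start, end) intervals if they overlap, otherwise return None."""
    (start_a, end_a), (start_b, end_b) = sorted([first, second])
    if start_b <= end_a:
        # The later interval begins before the earlier one ends: they overlap.
        return (start_a, max(end_a, end_b))
    return None
```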

8.5. **genome_solap_main**