1. **aesthetics.py**
1.1. **boxymcboxface**
1.1.1. Just some aesthetic display
2. **blaster.py**
2.1. **blastn_dic**
2.1.1. It creates a **BLAST database** from the **FASTA** file it reads
2.2. **blastn_blaster**
2.2.1. It executes a **BLASTn** search
2.3. **repetitive_blaster**
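
As a rough illustration of the two steps described for **blaster.py**, the sketch below builds the command lines for `makeblastdb` and `blastn` from the BLAST+ suite. The helper names, the default `word_size` and `perc_identity` values, and the file paths are assumptions made for this example, not the actual code in **blaster.py**.

```python
import subprocess


def makeblastdb_command(fasta_path, db_path):
    """Build the makeblastdb call for a nucleotide database (hypothetical helper)."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_path]


def blastn_command(query_path, db_path, word_size=11, perc_identity=60):
    """Build a BLASTn call with tabular output (outfmt 6) written to stdout.

    The default word_size and perc_identity are illustrative assumptions.
    """
    return [
        "blastn",
        "-query", query_path,
        "-db", db_path,
        "-word_size", str(word_size),
        "-perc_identity", str(perc_identity),
        "-outfmt", "6",
    ]


def run(command):
    """Run a command and return its standard output (needs BLAST+ on the PATH)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

With BLAST+ installed, `run(makeblastdb_command("genome.fasta", "db/genome"))` followed by `run(blastn_command("query.fasta", "db/genome"))` would return the hits as tab-separated text. Building the commands as lists keeps them easy to inspect and avoids shell quoting problems.
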
3. **identifiers.py**
3.1. **specific_sequence_extractor**
3.1.1. For each **chromosome ID** selected, it takes all the rows in the **CSV** file corresponding to that **chromosome ID**
3.2. **genome_specific_chromosome_main**
3.2.1. Calls all blasters and filters
4. **seq_modifiers.py**
4.1. **specific_sequence_1000nt**
4.1.1. For each **chromosome ID**, it will expand all the selected sequences in the **CSV** to **1000 nt**
4.2. **specific_sequence_corrected**
4.2.1. It reads the **CSV** file and finds the **correct coordinates** of the sequences.
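
One possible way to expand a hit to **1000 nt** is sketched below. It assumes 1-based inclusive coordinates with `start <= end`, pads both sides evenly, and shifts the window when it would run past either end of the chromosome. The function name and these rules are assumptions for illustration; **specific_sequence_1000nt** may handle the edges differently.

```python
def expand_to_length(start, end, chromosome_length, target=1000):
    """Widen the 1-based inclusive window [start, end] to `target` nt.

    Hypothetical helper: windows that are already long enough are returned
    unchanged, and the result never leaves [1, chromosome_length].
    """
    current = end - start + 1
    if current >= target:
        return start, end

    missing = target - current
    left = missing // 2                 # nucleotides added upstream
    new_start = start - left
    new_end = end + (missing - left)    # the remainder goes downstream

    # Shift the window right if it starts before the chromosome.
    if new_start < 1:
        new_end += 1 - new_start
        new_start = 1
    # Shift the window left if it ends after the chromosome.
    if new_end > chromosome_length:
        new_start -= new_end - chromosome_length
        new_end = chromosome_length

    # A chromosome shorter than `target` is returned whole.
    return max(new_start, 1), new_end


print(expand_to_length(5000, 5199, 100000))  # (4600, 5599)
```
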
5. **files_manager.py**
5.1. **folder_creator**
5.1.1. Creates a folder in the current path
5.2. **csv_miver**
5.2.1. Merges two **CSV** files into one
5.3. **csv_creator**
5.3.1. Creates a **CSV** file
5.4. **fasta_creator**
5.4.1. Creates a **FASTA** file from the input **CSV** file
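
The **CSV** to **FASTA** conversion could look roughly like the sketch below. The column names `sseqid` and `sseq`, the `Seq_<n>_<id>` header format, and the function name are assumptions for this example; the real **fasta_creator** may use other columns and headers.

```python
import csv
import io


def csv_to_fasta(csv_text, id_column="sseqid", seq_column="sseq"):
    """Turn CSV text with a header row into FASTA text (hypothetical helper).

    Each row becomes one record whose header combines a running number
    with the value of `id_column`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for index, row in enumerate(reader, start=1):
        records.append(f">Seq_{index}_{row[id_column]}\n{row[seq_column]}\n")
    return "".join(records)


example = "sseqid,sseq\nchr1,ACGT\nchr2,GGCC\n"
print(csv_to_fasta(example), end="")
```
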
6. **filters.py**
6.1. **chromosome_filter**
6.1.1. It reads a **FASTA** file and numbers all the sequences with a custom **prefix**
6.1.2. The sequences in the **FASTA** file must be in order
6.2. **dash_filter**
6.2.1. Filters **dashes** ("-") in a **CSV** file
6.3. **filter_by_column**
6.3.1. Filter **CSV** data depending on:
6.3.1.1. Minimum sequence **length**
6.3.1.2. Minimum identity **percent**
6.3.2. It then outputs a **CSV** file
6.4. **global_filters_main**
6.4.1. Mixes other filters
6.4.2. The output is a **CSV** file, which is constantly overwritten
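
The dash and column filters can be pictured with the sketch below, which works on rows already loaded as dictionaries. It assumes that dashes are alignment gaps to be stripped from the sequence, that length is measured after stripping, and that the columns are named `sseq` and `pident`. All of these are assumptions for illustration, as is the function naming.

```python
def remove_dashes(sequence):
    """Strip gap characters ("-") from a sequence string."""
    return sequence.replace("-", "")


def filter_rows(rows, min_length, min_identity):
    """Keep rows that meet both thresholds (hypothetical helper).

    rows: list of dicts with an "sseq" (sequence) and a "pident"
    (identity percent) entry, both assumed column names.
    """
    kept = []
    for row in rows:
        sequence = remove_dashes(row["sseq"])
        long_enough = len(sequence) >= min_length
        similar_enough = float(row["pident"]) >= min_identity
        if long_enough and similar_enough:
            # Store the cleaned sequence without changing the input row.
            kept.append({**row, "sseq": sequence})
    return kept


rows = [
    {"pident": "95.0", "sseq": "ACG-TACGTA"},
    {"pident": "80.0", "sseq": "ACGTACGTAC"},
    {"pident": "99.0", "sseq": "AC-GT"},
]
print(filter_rows(rows, min_length=8, min_identity=90))
# [{'pident': '95.0', 'sseq': 'ACGTACGTA'}]
```
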
7. **duplicates.py**
7.1. **genome_pre_duplicate_filter**
7.1.1. Filters the data in a **CSV** for duplicated sequences
7.2. **genome_duplicate_filter**
7.2.1. Calls *genome_pre_duplicate_filter* for the "+" and "-" DNA strands and merges the results.
7.2.2. The output is a **CSV** file with the filtered data.
8. **overlap.py**
8.1. **genome_solap_location_filter**
8.1.1. For each DNA strand we get the **start** and **end** coordinates
8.2. **genome_solap_location_grouping**
8.2.1. For each DNA strand, it groups coordinates by proximity
8.3. **genome_solap_minmax**
8.3.1. Depending on each DNA strand it will get the *minimum* or *maximum* of each **group**
8.4. **genome_solap_by_pairs**
8.4.1. It analyzes two overlapping sequences and joins them
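
The grouping and min/max steps in **overlap.py** can be pictured with the sketch below, which merges coordinates from a single strand. It assumes that "-" strand hits are stored with `start > end` (as in BLAST tabular output) and normalizes them first. The function name and the `max_gap` parameter are assumptions for illustration, not the module's actual interface.

```python
def merge_overlapping(intervals, max_gap=0):
    """Merge (start, end) pairs that overlap or lie within `max_gap` nt.

    Hypothetical helper: pairs may be given in either orientation, and
    the result is a sorted list of (minimum, maximum) pairs, one per group.
    """
    # Normalize so that start <= end regardless of strand orientation.
    normalized = sorted((min(s, e), max(s, e)) for s, e in intervals)

    merged = []
    for start, end in normalized:
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the current group: extend its maximum.
            group_start, group_end = merged[-1]
            merged[-1] = (group_start, max(group_end, end))
        else:
            # Too far away: open a new group.
            merged.append((start, end))
    return merged


print(merge_overlapping([(100, 200), (300, 150), (500, 600)]))
# [(100, 300), (500, 600)]
```

Sorting first means each interval only has to be compared with the most recent group, so the merge is a single pass after the sort.
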