Tools for Reproducibility in Data Intensive Social Science Projects

1. Part I

1.1. I.1. Creating accounts

1.1.1. The place in the cloud where we will keep our work organized

1.1.2. The one in charge of our bibliography

1.1.3. Easy way to use Python (for free)

1.1.3.1. Meet a NoteBook

1.1.4. Easy way to use R (free but limited)

1.1.5. This one will give you a DOI!

1.2. I.2. Installations

1.2.1. Anaconda

1.2.1.1. Install Jupyter

1.2.1.1.1. Create Environment

1.2.2. R

1.2.3. RStudio

1.2.4. GitHub Desktop

1.2.5. Git Large File Storage

1.2.6. Zotero Desktop

1.2.7. LaTeX

2. Part II

2.1. II.1. Collecting data

2.1.1. Online Surveys and APIs

2.1.1.1. Complete the survey!

2.1.1.1.1. Example in Python

2.1.1.2. Take a look at this API

2.1.1.2.1. Example in Python
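A minimal sketch of the API step, assuming the endpoint returns a JSON array of records. The URL `https://example.org/api/records`, the field names, and the helpers `fetch_text` and `parse_records` are hypothetical placeholders, not the API used in this map. Parsing is kept separate from downloading so it can be tried on a sample string without network access.

```python
import json
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Download the body of `url` and return it as text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_records(json_text):
    """Turn a JSON array of objects into a list of Python dicts."""
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


# Sample payload standing in for a real API response (made-up values).
sample = '[{"id": 1, "answer": "yes"}, {"id": 2, "answer": "no"}]'
rows = parse_records(sample)

# With a real endpoint, the call would look like this (hypothetical URL):
# rows = parse_records(fetch_text("https://example.org/api/records"))
```

Each element of `rows` is a plain dict, so the result can be saved or converted to a table in a later step.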

2.1.2. Scraping html

2.1.2.1. Example in Python
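One possible sketch of HTML scraping, using only the standard library's `html.parser`. The class name `TableParser` and the sample table are made up for illustration; a real page would be downloaded first and is usually messier than this.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up HTML standing in for a downloaded page.
html = """
<table>
  <tr><th>Country</th><th>Year</th></tr>
  <tr><td>Peru</td><td>1995</td></tr>
</table>
"""
parser = TableParser()
parser.feed(html)
table = parser.rows   # first row holds the headers, the rest hold the data
```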

2.2. II.2. Storing

2.2.1. 1. Create a REPO in the **GitHub cloud**; give it a simple name.

2.2.1.1. 1. Use the "new" button

2.2.1.2. 2. Complete the info requested

2.2.2. 2. **Clone** the newly created REPO to your local computer; this creates a local folder.

2.2.2.1. 1. Open GitHub Desktop and **log in**.

2.2.2.2. 2. Go back to the REPO just created, and clone it like this:

2.2.2.3. 3. ACCEPT if this message appears:

2.2.3. 3. Create a **local** folder named **DataFiles** inside the REPO on your local machine.

2.2.4. 4. Open **GitHub Desktop**. Everything appears unchanged, because Git does not track empty folders

2.2.4.1. like this:

2.2.5. 5. Download the earlier code files on WARS into the local REPO (outside the *DataFiles* folder).

2.2.5.1. Downloading from COLAB

2.2.6. 6. **Check** GitHub Desktop. Commit and push the downloaded files. See the result in the GitHub cloud.

2.2.7. 7. Open the programs on your computer

2.2.7.1. Open ANACONDA (use **Navigator**)

2.2.7.1.1. Go to your ENVIRONMENT

2.2.7.1.2. Open JUPYTER

2.2.7.1.3. Use JUPYTER to navigate to your local FOLDER (where the code files are).

2.2.8. 8. Run the code to save the output files.

2.2.8.1. In Python

2.2.9. 9. Commit and Push (the updated code and the files with the dataframe)
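Step 8 above could look roughly like the sketch below, which writes collected rows to a CSV file inside the DataFiles folder. The helper `save_rows`, the column names, and the file name `survey.csv` are assumptions made for illustration; only the standard library is used.

```python
import csv
from pathlib import Path


def save_rows(rows, fieldnames, folder, filename):
    """Write a list of dicts to `folder/filename` as CSV and return the path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)   # create the folder if missing
    path = folder / filename
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Made-up rows standing in for the collected data.
rows = [{"id": 1, "answer": "yes"}, {"id": 2, "answer": "no"}]

# Inside the local REPO this could be called as (hypothetical file name):
# save_rows(rows, ["id", "answer"], "DataFiles", "survey.csv")
```

Once the file exists, GitHub Desktop lists it as a change, ready for the commit and push in step 9.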

2.3. II.3. Accessing files

2.3.1. Locally

2.3.1.1. In Python
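A minimal sketch of local access, assuming the notebook sits in the REPO folder next to DataFiles. The file name `survey.csv` and the helper `load_rows` are hypothetical. A relative path is used so the same code works on any machine that has cloned the REPO.

```python
import csv
from pathlib import Path


def load_rows(path):
    """Read a CSV file into a list of dicts (all values arrive as strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Relative path from the folder where the notebook lives (hypothetical name).
data_path = Path("DataFiles") / "survey.csv"

if data_path.exists():   # only read when the file is actually there
    rows = load_rows(data_path)
```

Because every value is read as text, numeric and date columns still need to be converted afterwards.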

3. Part III

3.1. III.1. Data Pre Processing in Python

3.1.1. Cleaning

3.1.1.1. _Regex_ approach

3.1.1.2. _Divide and Conquer_
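The two cleaning approaches can be contrasted on a single messy cell. This is only a sketch: the sample string and variable names are made up, and the regular expression is one of many that would work here.

```python
import re

raw = "  1,234.5 km (est. 2019) "   # made-up messy cell

# Regex approach: one pattern captures the number, commas included.
match = re.search(r"\d[\d,]*\.?\d*", raw)
by_regex = float(match.group().replace(",", ""))

# Divide and conquer: peel the string apart in small, checkable steps.
before_paren = raw.split("(")[0]        # keep the text before the parenthesis
number_text = before_paren.split()[0]   # keep the first whitespace-separated token
by_steps = float(number_text.replace(",", ""))
```

The regex version is compact but harder to read; the step-by-step version is longer, yet each intermediate value can be inspected when something goes wrong.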

3.1.2. Formatting

3.1.2.1. Numeric

3.1.2.2. String

3.1.2.3. Categorical

3.1.2.4. Date

3.1.2.5. Geometry
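A short sketch of the formatting step for the numeric, string, categorical, and date cases, using the standard library only. The record, its field names, and the category order are made up for illustration; geometry is left out because it normally requires a dedicated spatial library.

```python
from datetime import datetime

# Made-up record as it might arrive from a CSV file: everything is text.
record = {
    "population": "32,510",
    "name": "  lima ",
    "level": "high",
    "start": "18/08/2022",
}

# Numeric: drop the thousands separator, then convert.
population = int(record["population"].replace(",", ""))

# String: trim whitespace and normalize capitalization.
name = record["name"].strip().title()

# Categorical (ordinal): map each label to its rank in an explicit order.
levels = ["low", "medium", "high"]
level_rank = levels.index(record["level"])

# Date: parse day/month/year text into a date object.
start = datetime.strptime(record["start"], "%d/%m/%Y").date()
```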

3.1.3. Integration

3.1.3.1. Appending

3.1.3.2. Merging
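Appending and merging can be illustrated with plain lists of dicts. The tables, the key column `id`, and the variable names are invented for this sketch, and the merge shown is an inner join, which is only one of several possible join types.

```python
# Two made-up tables with the same columns (for example, two survey waves).
wave_1 = [{"id": 1, "score": 10}, {"id": 2, "score": 7}]
wave_2 = [{"id": 3, "score": 9}]

# Appending: stack the rows of tables that share the same columns.
appended = wave_1 + wave_2

# A second made-up table that shares only the key column `id`.
regions = [{"id": 1, "region": "North"}, {"id": 3, "region": "South"}]

# Merging (inner join on `id`): index one table by key, then look up each row.
region_by_id = {row["id"]: row for row in regions}
merged = [
    {**row, **region_by_id[row["id"]]}
    for row in appended
    if row["id"] in region_by_id
]
```

Row 2 has no match in `regions`, so the inner join drops it; comparing row counts before and after a merge is a quick way to spot unmatched keys.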

3.1.4. Transforming

3.1.4.1. Aggregating

3.1.4.2. Reshaping
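As a rough sketch of the two transformations, the example below totals a value per group (aggregating) and then pivots the same rows from long to wide form (reshaping). The data and column names are made up, and only the standard library is used.

```python
from collections import defaultdict

# Made-up long table: one row per (region, year) observation.
long_rows = [
    {"region": "North", "year": 2020, "cases": 5},
    {"region": "North", "year": 2021, "cases": 8},
    {"region": "South", "year": 2020, "cases": 3},
]

# Aggregating: total cases per region.
totals = defaultdict(int)
for row in long_rows:
    totals[row["region"]] += row["cases"]

# Reshaping (long to wide): one entry per region, one key per year.
wide = defaultdict(dict)
for row in long_rows:
    wide[row["region"]][row["year"]] = row["cases"]
```

In the wide form, a missing combination (South in 2021 here) simply has no key, which makes gaps in the data easy to detect.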

3.2. III.2. Data Exploration in R

3.2.1. On GitHub

3.2.1.1. Go to the previous REPO where you have the DataFiles folder

3.2.1.2. Find the file

3.2.1.3. Click on the file name

3.2.1.4. Get the link to the FILE

3.2.1.4.1. using RAW

3.2.1.4.2. using DOWNLOAD

3.2.1.5. Use the link as if it were a file location
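The RAW link follows a predictable pattern, so it can also be derived from the address of the file's page. The sketch below is in Python to match the other examples; the helper `to_raw_url`, the user, the repository, and the file name are all hypothetical. The resulting link is plain text and can be pasted into R code just the same.

```python
from urllib.parse import urlsplit


def to_raw_url(blob_url):
    """Convert a github.com file page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    pieces = parts.path.strip("/").split("/")
    # Expected path layout: USER/REPO/blob/BRANCH/path/to/file
    if parts.netloc != "github.com" or len(pieces) < 5 or pieces[2] != "blob":
        raise ValueError("not a GitHub file page URL")
    user, repo, _, *rest = pieces
    return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])


# Hypothetical repository and file names.
page = "https://github.com/someuser/somerepo/blob/main/DataFiles/survey.csv"
raw_link = to_raw_url(page)
```

Reading from the link instead of a local path means anyone can rerun the analysis without cloning the REPO first, as long as the REPO is public.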

3.2.2. Open RStudio on your local computer

3.2.2.1. or open RStudio Cloud!

3.2.3. Exploring Table

3.2.4. Exploring Text

3.2.5. Exploring Map

4. Part IV

4.1. IV.1. Preparation

4.1.1. Bibliography (examples)

4.1.1.1. https://www.worldcat.org/title/exploratory-data-analysis/oclc/3058187

4.1.1.2. https://www.amazon.com/-/es/Adam-G-Petrie/dp/1516554280/ref=sr_1_8?__mk_es_US=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=21T3RBYDX7612&keywords=introduction+to+regression&qid=1660835038&sprefix=introduction+to+regression%2Caps%2C225&sr=8-8

4.1.1.3. https://www.amazon.com/-/es/Chris-Brunsdon/dp/1446272958/ref=sr_1_10?__mk_es_US=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=14AGM5SZZ89MA&keywords=gis+in+R&qid=1660966114&s=books&sprefix=gis+in+r%2Cstripbooks%2C239&sr=1-10

4.1.1.4. https://guylipman.medium.com/the-art-of-wordle-f861204a1f99

4.1.1.5. pdf1

4.1.1.6. pdf2

4.1.2. Citation files

4.1.3. Repository

4.1.4. DOI for Repository

4.2. IV.2. Publishing

4.2.1. Web

4.2.2. Printout