New Twitter Infrastructure
by Michael Pereira
1. Code
1.1. Producer
1.1.1. Manage Twitter Stream (connection, error logging, ...)
1.1.2. Manage RabbitMQ (connection, error logging)
1.1.3. Transfer serialized tweets to the queue
1.1.4. Test suite
1.1.5. Documentation
1.2. Consumer
1.2.1. Manage RabbitMQ (connection, error logging)
1.2.2. Manage DB-agnostic layer (connection, error logging)
1.2.3. Deserialize and process tweets
1.2.4. Store tweets through the DB-agnostic interface
1.2.4.1. MySQL implementation
1.2.4.2. NoSQL implementation
1.2.5. Test suite
1.2.6. Documentation
2. Continuous Integration
2.1. Jenkins
2.1.1. Installation
2.1.2. Build setup
3. Prerequisites
3.1. Environment setup
3.1.1. RabbitMQ
3.1.2. WAMP
3.1.3. Twitter account and API key
3.1.4. Git/GitHub
3.2. Back up MySQL (185 GB)
3.3. Learn RabbitMQ
3.4. Study NoSQL alternatives
3.4.1. Cassandra
3.4.2. MongoDB
3.4.3. Voldemort
4. Backup
4.1. Create regular data backups of the database
4.2. How often?
4.3. Where to store them?
4.4. Which format?
5. Server Configuration
5.1. Launch at startup
5.1.1. RabbitMQ
5.1.2. Artifactory
5.1.3. MySQL
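
Sketch for items 1.1.3, 1.2.3 and 1.2.4: one possible shape for the serialized tweet and the DB-agnostic storage layer. This is only an illustration under stated assumptions, not the planned implementation. JSON as the wire format, the names TweetStore, SqlTweetStore, DocumentTweetStore and consume, and the "id"/"text" tweet fields are all hypothetical. SQLite stands in for MySQL and a plain dictionary stands in for a NoSQL store so that the example runs without any server.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


def serialize_tweet(tweet: dict) -> bytes:
    """Producer side (1.1.3): encode a tweet as UTF-8 JSON for the queue."""
    return json.dumps(tweet, ensure_ascii=False).encode("utf-8")


def deserialize_tweet(body: bytes) -> dict:
    """Consumer side (1.2.3): decode a queue message back into a tweet."""
    return json.loads(body.decode("utf-8"))


class TweetStore(ABC):
    """Hypothetical DB-agnostic interface (1.2.4)."""

    @abstractmethod
    def save(self, tweet: dict) -> None:
        """Persist one tweet; saving the same id twice must not duplicate it."""

    @abstractmethod
    def count(self) -> int:
        """Return the number of stored tweets."""


class SqlTweetStore(TweetStore):
    """Relational backend (1.2.4.1). SQLite is a stand-in for MySQL here."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS tweets "
            "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )

    def save(self, tweet: dict) -> None:
        # Parameterized query: tweet text is never interpolated into SQL.
        self._conn.execute(
            "INSERT OR REPLACE INTO tweets (id, text) VALUES (?, ?)",
            (tweet["id"], tweet["text"]),
        )
        self._conn.commit()

    def count(self) -> int:
        return self._conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


class DocumentTweetStore(TweetStore):
    """Document-style backend (1.2.4.2). A dict is a stand-in for a NoSQL store."""

    def __init__(self) -> None:
        self._docs = {}

    def save(self, tweet: dict) -> None:
        # Keyed by tweet id, so a redelivered message overwrites the old copy.
        self._docs[tweet["id"]] = dict(tweet)

    def count(self) -> int:
        return len(self._docs)


def consume(body: bytes, store: TweetStore) -> None:
    """Process one queue message: deserialize it and hand it to the store."""
    store.save(deserialize_tweet(body))
```

The consumer only depends on TweetStore, so switching between the MySQL and NoSQL implementations would be a change at construction time, for example consume(body, SqlTweetStore()) versus consume(body, DocumentTweetStore()). Keying both stores by tweet id keeps a redelivered RabbitMQ message from creating a duplicate row.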