Social Network Project
The client is a start-up company with USD 6M in funding. The company's product helps creative professionals around the world share their work and find inspiration for it. The social network site grew to 60K professionals and 300K high-quality artworks.
The social network project needed to solve the following major problems:
- Process high-quality media of up to 20MB per photo, as well as video and audio. The system has to process uploaded media in the background so that it loads faster in the browser
- Efficiently query social activities to display on the dashboard of a user who follows up to 5000 creatives. A traditional relational database such as MySQL struggles with this kind of query
- Build a recommendation system based on each user's activities and favorites
- Build a highly scalable system on top of AWS services
- Automate the deployment process to maintain the different stages of development and testing
- Mobile applications with limited memory and network bandwidth must load and render media efficiently and smoothly
- Built a job queue on beanstalkd, a simple work queue whose protocol is inspired by memcached. Its interface is generic, but it was originally designed to reduce page-view latency in high-volume web applications by running time-consuming tasks asynchronously. The workers are monitored by Supervisor, a process control system, which respawns a background process if it exits or is killed by the OS for any reason. When media is uploaded, a new job is created and pushed onto the queue. A worker then picks it up to generate thumbnails, resize images and process video with ffmpeg, and uploads the results for delivery via AWS CloudFront. This significantly improved the responsiveness of the system for heavy jobs. We also use the queue for other jobs, for example sending email, generating weekly reports and scheduled system cleanup
- The team had a difficult time modeling and querying the network data in MySQL. We later decided to move that data to Neo4j, a graph database. MySQL is still the primary database, but the follower network and the social activities are modeled in the graph database.
- The mobile app caches images on the SD card and frees the cache periodically, which improves the loading experience without consuming too much space on the card. The mobile app dashboard can scroll through up to 200 images without lag.
- Built the system with scale in mind from the beginning. The architecture splits web servers from data servers, with HAProxy load balancing in place. The code is designed to work with multiple data servers and multiple web servers.
- Adopted Chef for dev-ops, together with the Jenkins build system. We also keep separate configuration-based environments for development, staging and production. The team can deploy a major build to production or staging in a matter of minutes without any manual work.
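The upload flow from the job-queue item above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: Python's standard `queue.Queue` stands in for a beanstalkd tube, the function names and target widths are hypothetical, and the ffmpeg commands are only built, not executed, unless a runner is passed in.

```python
import json
import os
import queue

# Stand-in for a beanstalkd tube (assumption); a real deployment would put
# and reserve jobs through a beanstalkd client instead.
media_jobs = queue.Queue()

# Hypothetical target widths; the actual sizes are not given in this write-up.
TARGET_WIDTHS = (320, 640, 1280)


def enqueue_media_job(media_id, source_path):
    """Called by the upload handler: record the work and return immediately."""
    payload = json.dumps({"media_id": media_id, "source": source_path})
    media_jobs.put(payload)
    return payload


def build_resize_commands(source_path, widths=TARGET_WIDTHS):
    """Build one ffmpeg command per target width, keeping the aspect ratio."""
    root, ext = os.path.splitext(source_path)
    commands = []
    for width in widths:
        output = f"{root}_{width}{ext}"
        # scale=<width>:-2 lets ffmpeg pick an even height that preserves the ratio.
        commands.append(
            ["ffmpeg", "-y", "-i", source_path, "-vf", f"scale={width}:-2", output]
        )
    return commands


def process_next_job(run=None):
    """Worker step: take one job off the queue and process it.

    `run` is injected so the sketch works without ffmpeg installed; a real
    worker could pass a wrapper around subprocess.run.
    """
    job = json.loads(media_jobs.get_nowait())
    commands = build_resize_commands(job["source"])
    for command in commands:
        if run is not None:
            run(command)
    media_jobs.task_done()
    return job["media_id"], commands
```

The upload request only pays for `enqueue_media_job`; the slow work happens in `process_next_job`, which would be called in a loop by a worker process kept alive by Supervisor.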
- Maintain multiple sizes of each media file and serve a different quality based on the user's browser. The background system also stops uploads from blocking users, who can continue their activities on the website. Loading time dropped significantly, from a range of 30s to minutes down to under 5s.
- With the minification system, the CDN and optimized HTML templates, the YSlow performance score improved from E to B
- Querying activities from a network of 4000 users went from minutes in MySQL to seconds with the graph database. Further optimization reduced the load time from 7s to under 2s, with little dependence on network size
- The system scales well. We can add or remove resources such as EC2 instances and Amazon RDS, depending on system load, in a matter of hours, and all of it can be done via the aws command
- Automated the tedious processes of switching to maintenance mode, deploying new builds and testing new
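The dashboard query mentioned above (newest activities from everyone a user follows) can be pictured as a two-hop traversal: user to followed creators, then creators to their activities. The sketch below is only an illustration of that idea in plain Python. It does not use Neo4j, and the dictionaries, sample data and the `dashboard_feed` name are hypothetical.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the graph: who follows whom, and each
# creator's activities stored newest-first as (timestamp, description).
follows = {
    "alice": ["bob", "carol"],
}
activities = {
    "bob": [(105, "bob uploaded a photo"), (101, "bob liked an artwork")],
    "carol": [(103, "carol uploaded a video")],
}


def dashboard_feed(user, limit=20):
    """Return the newest `limit` activities from everyone `user` follows."""
    streams = [activities.get(creator, []) for creator in follows.get(user, [])]
    # Every stream is already newest-first, so a lazy k-way merge reads only
    # as many entries as one dashboard page needs.
    merged = heapq.merge(*streams, key=lambda item: item[0], reverse=True)
    return list(islice(merged, limit))
```

Because the merge is lazy, the cost of one page depends mostly on the page size and the number of followed creators, not on the total number of activities stored. That is one plausible reading of why the graph-based query depended little on network size.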