Data
Resources
Tools & Technologies
[ h ] Apache Airflow
[ h ] Apache Hadoop
[ h ] Apache Kafka
[ h ] Apache Spark
[ h ] Dask
[ h ] Docker
[ h ] Kubeflow
[ h ] Kubernetes
[ h ] MongoDB
[ h ] MySQL
[ h ] Neo4j
[ h ] PostgreSQL
[ h ] Redis
freeCodeCamp
[ y ] 06-16-2021. “Data Analytics Crash Course: Teach Yourself in 30 Days”.
[ y ] 07-02-2018. “SQL Tutorial - Full Database Course for Beginners”.
[ y ] 08-31-2018. “Database Design Course - Learn how to design and plan a database for beginners”.
Geek’s Lessons
My Lesson
[ y ] 11-17-2022. “IBM Data Analyst Complete Course | Data Analyst Tutorial For Beginners”.
Nerd’s Lesson
[ y ] 07-10-2023. “Advanced Data Analytics,Google | Data Analytics”.
[ y ] 04-15-2023. “Database Engineering Complete Course | DBMS Complete Course”. YouTube.
more
[ y ] 03-01-2023. Stenway. “Stop Using CSV !”.
[ y ] 01-21-2024. ExplainingComputers. “Explaining File Compression Formats”.
online
Texts
DBs
[ h ] Abiteboul, Serge; Richard Hull; & Victor Vianu. (1995). Foundations of Databases. Addison-Wesley.
Campbell, Laine & Charity Majors. (2017). Database Reliability Engineering: Designing and Operating Resilient Database Systems. O’Reilly.
[ h ] Lemahieu, Wilfred, Seppe Vanden Broucke, & Bart Baesens. (2018). Principles of Database Management: The Practical Guide to Storing, Managing, and Analyzing Big and Small Data. Cambridge University Press.
[ h ] Petrov, Alex. (2019). Database Internals: A Deep Dive into How Distributed Data Systems Work. O’Reilly.
[ h ] Ramakrishnan, Raghu & Johannes Gehrke. (2003). Database Management Systems, 3rd Ed. McGraw-Hill.
Data
[ h ] Doan, AnHai; Alon Halevy; & Zachary Ives. (2012). Principles of Data Integration. Morgan Kaufmann.
Data Analytics & Science
[ g ] Buisson, Florent. (2021). Behavioral Data Analysis with R and Python. O’Reilly.
[ g ] Cohen, Mike X. (2022). Practical Linear Algebra for Data Science: From Core Concepts to Applications Using Python. O’Reilly.
[ g ] Densmore, James. (2021). Data Pipelines Pocket Reference: Moving and Processing Data for Analytics. O’Reilly.
Downey, Allen B. (2023). Elements of Data Science. No Starch Press.
Farrelly, Colleen M. & Yae Ulrich Gaba. (2023). The Shape of Data: Network Science, Geometry-Based Machine Learning, and Topological Data Analysis in R. No Starch Press.
[ g ] Fregly, Chris & Antje Barth. (2021). Data Science on AWS: Implementing End-to-End, Continuous AI and Machine Learning Pipelines. O’Reilly.
[ g ] Grus, Joel. (2019). Data Science from Scratch: First Principles with Python. 2nd Ed. O’Reilly.
[ h ][ g ] Janssens, Jeroen. (2021). Data Science at the Command Line: Obtain, Scrub, Explore, and Model Data with Unix Power Tools. 2nd Ed. O’Reilly.
[ g ] Lakshmanan, Valliappa. (2022). Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning. 2nd Ed. O’Reilly.
[ g ] Lakshmanan, Valliappa. (2018). Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning. O’Reilly.
McGregor, Susan E. (2021). Practical Python Data Wrangling and Data Quality: Getting Started with Reading, Cleaning, and Analyzing Data. O’Reilly.
[ h ] McKinney, Wes. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, 2nd Ed. O’Reilly.
Nield, Thomas. (2022). Essential Math for Data Science: Take Control of Your Data with Fundamental Linear Algebra, Probability, and Statistics. O’Reilly.
Provost, Foster & Tom Fawcett. (2013). Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking. O’Reilly.
Tuckfield, Bradford. (2023). Data Science for Business People. No Starch Press.
[ g ] Tuulos, Ville. (2022). Effective Data Science Infrastructure: How to make data scientists productive. Manning.
[ g ] VanderPlas, Jake. (2016). Python Data Science Handbook: Essential Tools for Working with Data. O’Reilly.
Data Viz
Dale, Kyran. (2022). Data Visualization with Python and JavaScript. 2nd Ed. O’Reilly.
[ g ] Dougherty, Jack & Ilya Ilyankou. (2021). Hands-On Data Visualization: Interactive Storytelling from Spreadsheets to Code. O’Reilly.
[ g ] Fisher, Danyel & Miriah Meyer. (2018). Making Data Visual: A Practical Guide to Using Visualization for Insight. O’Reilly.
Healy, Kieran. (2019). Data Visualization: A Practical Introduction. Princeton University Press.
[ g ] Murray, Scott. (2017). Interactive Data Visualization for the Web: An Introduction to Designing with D3, 2nd Ed. O’Reilly.
[ g ] Thomas, Stephen A. (2015). Data Visualization with JavaScript. No Starch Press.
[ g ] Wilke, Claus O. (2019). Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures. O’Reilly.
More
Alexopoulos, Panos. (2020). Semantic Modeling for Data: Avoiding Pitfalls and Breaking Dilemmas. O’Reilly.
Eryurek, Evren et al. (2021). Data Governance: The Definitive Guide: People, Processes, and Tools to Operationalize Data Trustworthiness. O’Reilly.
[ h ][ g ] Kleppmann, Martin. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly.
Reis, Joe & Matt Housley. (2022). Fundamentals of Data Engineering: Plan and Build Robust Data Systems. O’Reilly.
Seldess, Jesse, Ben Darnell, & Guy Harrison. (2022). CockroachDB: The Definitive Guide: Distributed Data at Scale. O’Reilly.
Strengholt, Piethein. (2020). Data Management at Scale: Best Practices for Enterprise Architecture. O’Reilly.
[ h ] Uttamchandani, Sandeep. (2020). The Self-Service Data Roadmap: Democratize Data and Reduce Time to Insight. O’Reilly.
Terms
[ w ] Column
[ w ] Column-Oriented DBMS
[ w ] Create, Read, Update, Delete (CRUD)
[ w ] Data Architecture
[ w ] Data Cleaning
[ w ] Data Extraction
[ w ] Data Integration
[ w ] Data Integrity
[ w ] Data Lake
[ w ] Data Management
[ w ] Data Migration
[ w ] Data Pipeline
[ w ] Data Processing
[ w ] Data Retrieval
[ w ] Data Sink
[ w ] Data Storage
[ w ] Data Transformation
[ w ] Data Type
[ w ] Data Warehouse
[ w ] Database
[ w ] Database Management System (DBMS)
[ w ] Database Modeling
[ w ] Database System
[ w ] Enterprise Data Model
[ w ] Extract, Load, Transform (ELT)
[ w ] Extract, Transform, Load (ETL)
[ w ] Graph Database (GDB)
[ w ] Heterogeneous Database System (HDB)
[ w ] Hierarchical Database Model
[ w ] MUMPS
[ w ] NoSQL
[ w ] Object Database
[ w ] Object-Oriented Database Management System (OODBMS)
[ w ] Object-Relational Database (ORD)
[ w ] Object-Relational Database Management System (ORDBMS)
[ w ] Object-Relational Mapping (ORM)
[ w ] Online Analytical Processing (OLAP)
[ w ] Online Transaction Processing (OLTP)
[ w ] Operational Data Store
[ w ] Query Language
[ w ] Relational Database Management System (RDBMS)
[ w ] Row
[ w ] Structured Query Language (SQL)
[ w ] Unstructured Data