After covering the importance of Julia to the data science community and several essential data science principles, we start with the basics, including how to install Julia and its powerful libraries. Many examples illustrate how to leverage each Julia command, dataset, and function.
Specialized script packages are introduced and described. We provide hands-on problems representative of those commonly encountered throughout the data science pipeline and guide you in using Julia to solve them with published datasets. Many of these scenarios make use of existing packages and built-in functions. Topics covered include:
- An overview of the data science pipeline along with an example illustrating the key points, implemented in Julia
- Options for Julia IDEs
- Programming structures and functions
- Engineering tasks, such as importing, cleaning, formatting and storing data, as well as performing data preprocessing
- Data visualization and some simple yet powerful statistics for data exploration purposes
- Dimensionality reduction and feature evaluation
- Machine learning methods, ranging from unsupervised (different types of clustering) to supervised ones (decision trees, random forests, basic neural networks, regression trees, and Extreme Learning Machines)
- Graph analysis, including pinpointing the connections among the various entities and mining them for useful insights
Each chapter concludes with a series of questions and exercises to reinforce what you have learned. The last chapter of the book guides you in creating a data science application from scratch using Julia.