Chapter 2
This chapter introduces some popular tools.
Chocolatey is a popular package manager for Windows that makes it easy to install, update, and manage software packages. It offers a large selection of packages and advanced features.
See also Winget.
Docker is a platform for building, shipping, and running applications in containers.
Git is a popular version control system that allows developers to track changes to their code and collaborate with others on a project. It provides a way to manage and organize code, and allows for easy branching and merging. Git is widely used in software development, and is an essential tool for any developer’s toolkit.
GitHub is a web-based platform that provides a range of features for managing Git repositories. It allows developers to host their code online, collaborate with others on a project, and track issues and bugs. GitHub is widely used in the open-source community and is a popular tool for managing software development projects.
Homebrew is a package manager for macOS that makes it easy to install, update, and manage software packages.
Jupyter is a popular web-based interactive computing environment that
allows data analysts to create and share documents
containing live code, visualizations, and narrative text.
PowerShell is a command line shell and scripting language developed by Microsoft. It is designed to automate system administration tasks and provide an extensible platform for developers to write their own scripts and tools. PowerShell is widely used on Windows systems, and is becoming increasingly popular as a cross-platform tool for managing and automating IT infrastructure.
Visual Studio Code, often referred to as VS Code, is a lightweight but powerful source code editor that is popular among developers.
It is highly customizable and supports a wide range of programming languages,
making it a versatile tool for developers of all skill levels.
VS Code also has a large ecosystem of extensions that can be used to extend its functionality.
Winget is a newer, lightweight package manager developed by Microsoft for Windows 10 and Windows 11 that makes it easy to install, update, and manage software packages.
See also Chocolatey.
In addition, there are several important language-specific tools.
Conda is a popular package and environment manager, often used for Python with the Anaconda or Miniconda distributions.
See also pip.
pip is the default, widely used package manager for Python
that makes it easy to install, update, and manage Python packages and dependencies.
It is an essential tool for working with Python projects.
See also conda.
Subsections of Tools
Chocolatey
Chocolatey is a package manager for Windows,
similar to Homebrew for macOS.
It simplifies the installation, updating, and management of Windows software,
including command-line tools, applications, and libraries.
Chocolatey uses NuGet infrastructure and PowerShell to manage packages,
making it a powerful tool for Windows users.
Alternatives
Winget is Microsoft’s official package manager,
designed to be the native package manager for Windows.
It is gaining new features and improvements over time.
To choose the best package manager for your needs, consider the following.
Community and package availability
- Both Chocolatey and Winget have growing communities.
- Chocolatey has been around for longer and has a larger repository of packages.
- As Winget gains traction, its community and package offerings will likely grow.
Official support
- As an official Microsoft product, Winget may receive better long-term support and integration with the Windows ecosystem.
- This could make it a more future-proof choice.
Features and functionality
- Chocolatey has more mature features and a comprehensive set of tools.
- However, Winget is expected to gain more features and improvements over time.
Docker
Docker is an open-source platform that automates the deployment, scaling, and management of applications by using containerization technology. It allows developers to package an application and its dependencies (libraries, configuration files, etc.) into a single, lightweight, and portable container. These containers can run consistently across different environments, simplifying application development, testing, and deployment.
Docker provides the following features.
Containerization
Docker uses containerization to isolate applications and their dependencies
into separate, self-contained units. This approach ensures that each application
runs in a consistent environment, reducing conflicts and improving security.
Image Management
Docker images are templates used to create containers. They are lightweight and
can be easily shared, stored, and versioned. Docker Hub, the official public
registry, hosts thousands of pre-built images for various programming languages, frameworks, and tools.
Portability
Docker containers can run on any system that supports Docker, regardless of the
underlying infrastructure or platform. This makes it easy to deploy and migrate
applications across different environments, such as development, testing, and
production.
Scalability
Docker enables horizontal scaling of applications by allowing you to deploy
multiple instances of the same container. This approach can help distribute the
load across multiple resources and improve application performance.
Version Control
Docker images can be versioned and stored in registries, making it easy to
roll back, upgrade, or downgrade applications as needed. This also facilitates
collaboration among team members, as they can share and use the same
image versions.
Ecosystem
Docker has a rich ecosystem of tools and services, and many
third-party tools and plugins integrate with
Docker to extend its functionality.
Managing Containers
Docker containers can be managed with Kubernetes,
a popular open-source container orchestration platform. Kubernetes is designed to automate the deployment, scaling, and management of containerized applications, including Docker containers.
Kubernetes provides features such as automatic scaling, self-healing, and load balancing.
Kubernetes can manage Docker containers running on a single host or across a cluster of hosts, abstracting away the underlying infrastructure and providing a consistent and scalable platform for running containerized workloads.
Technologies such as Docker Swarm, Apache Mesos, Nomad, and OpenShift perform similar functions to Kubernetes.
Installation
The installation process for Docker depends on your operating system. Follow the instructions below based on your platform.
Common Files
When working with Docker, you’ll encounter several common files.
Dockerfile
A Dockerfile defines the steps required to build a Docker image.
It contains instructions such as:
- FROM - specifies the base image to use
- RUN - runs commands to install dependencies and set up the environment
- COPY - copies files from the host machine into the image
- CMD - specifies the command to run when the container is started
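As a minimal sketch of how these instructions fit together, the shell snippet below writes a small Dockerfile for a hypothetical Python script. The base image tag python:3.12-slim, the file names requirements.txt and app.py, and the image name demo-app are assumptions chosen for illustration; they are not part of this guide.

```shell
# Work in a throwaway directory so nothing in a real project is overwritten.
workdir="$(mktemp -d)"
cd "$workdir"

# Write an illustrative Dockerfile (all names below are hypothetical).
cat > Dockerfile <<'EOF'
# FROM: start from an official Python base image
FROM python:3.12-slim
# WORKDIR: set the working directory inside the image
WORKDIR /app
# COPY: bring the dependency list from the host into the image
COPY requirements.txt .
# RUN: install the dependencies while the image is built
RUN pip install --no-cache-dir -r requirements.txt
# COPY: add the application code
COPY app.py .
# CMD: the command that runs when a container starts
CMD ["python", "app.py"]
EOF

cat Dockerfile
# With Docker running and the two files present, the image could be built with:
#   docker build -t demo-app .
```

Copying requirements.txt and installing dependencies before copying the application code is a common ordering, because Docker can then reuse the cached dependency layer when only the code changes.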
docker-compose.yml
Defines and runs multi-container Docker applications.
docker-compose.yml allows developers to define the services that make up
the application, their dependencies, and how they are connected.
This file can be used to start, stop, and manage
containers in a multi-container application.
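To make the idea of services and their dependencies concrete, here is a minimal sketch that writes a docker-compose.yml for a hypothetical two-service application. The service names web and db, the port 8000, the postgres:16 image, and the example password are all assumptions for illustration only.

```shell
# Work in a throwaway directory so nothing in a real project is overwritten.
workdir="$(mktemp -d)"
cd "$workdir"

# Write an illustrative docker-compose.yml (all names below are hypothetical).
cat > docker-compose.yml <<'EOF'
services:
  web:
    # Build the image from the Dockerfile in the current directory
    build: .
    # Map port 8000 on the host to port 8000 in the container
    ports:
      - "8000:8000"
    # Start the db service before the web service
    depends_on:
      - db
  db:
    # Use a pre-built image from Docker Hub
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
EOF

cat docker-compose.yml
# With Docker running, typical commands would be:
#   docker compose up -d     # start both services in the background
#   docker compose down      # stop and remove them
```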
.dockerignore
Like .gitignore in Git repositories,
.dockerignore is used to specify files and directories
that should be excluded from the Docker build context.
By excluding unnecessary files and directories,
the Docker build process is faster and more efficient.
Dockerfile.dev
Dockerfile.dev is a Dockerfile variant for development environments.
It contains additional instructions for setting up a development environment,
such as installing development tools and enabling debugging.
See Also
Learn more about Docker and the associated tools.
Subsections of Docker
Docker: Installation
Docker is an open-source platform that automates the deployment, scaling, and management of applications by using containerization technology.
Use Docker to create, manage, and deploy containerized applications.
Mac/Linux Users
Option 1: Official installation instructions. Follow instructions on the official Docker website. This is the most up-to-date and comprehensive guide to installing Docker on your system.
Option 2: Step-by-step installation guide. Check out our installation instructions for a step-by-step guide.
Windows Users
Option 1: Official installation instructions. Follow the instructions on the official Docker website. This is the most up-to-date and comprehensive guide to installing Docker Desktop on your Windows system.
Option 2: Step-by-step installation guide. Check out our installation instructions for a step-by-step guide.
Subsections of Docker: Installation
Docker: Mac/Linux
The best way to install Docker for Mac and Linux is by using Docker Desktop
(for Mac) and Docker Engine (for Linux). Docker provides a complete development
environment for containerized applications.
Warning: Docker is a resource-intensive application that may consume a significant amount of disk space, memory, and CPU resources. Installing and running Docker on your system may slow down your machine, especially if it has limited resources. Make sure your system meets the minimum requirements before installing Docker, and consider monitoring resource usage to ensure optimal performance.
Follow these steps to install Docker on Mac and Linux.
For Mac:
Ensure your system meets the requirements:
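- macOS 10.14 (Mojave) or later. Newer Docker Desktop releases may require a more recent macOS version, so confirm the current minimum on the official Docker website.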
Download Docker Desktop for Mac from the official Docker website.
Run the installer:
- Double-click the downloaded Docker Desktop Installer.dmg file and follow the on-screen instructions.
Start Docker Desktop:
- After the installation is complete, Docker Desktop should start automatically. If it doesn’t, you can launch it from the Applications folder.
- You will see the Docker icon in the menu bar, indicating that Docker is running.
Verify the installation:
- Open a Terminal window.
- Run the following command to check the Docker version.
docker --version
Run a test container to ensure that Docker is working correctly.
docker run hello-world
For Linux:
Choose the appropriate installation instructions for your Linux distribution from the official Docker Engine documentation.
Follow the provided instructions to install Docker Engine on your system.
Verify the installation.
- Open a Terminal window.
- Run the following command to check the Docker version.
docker --version
Run a test container to ensure that Docker is working correctly.
docker run hello-world
Save Resources
Docker takes a lot of resources.
You may want to stop Docker when you are not using it.
For Mac
- Locate the Docker icon in the menu bar, which is typically located in the upper-right corner of the screen.
- Click on the Docker icon to open the dropdown menu.
- Click on “Quit Docker Desktop” or “Exit” to stop Docker Desktop.
For Linux
- Open a Terminal window.
- Run the following command to stop the Docker daemon.
sudo systemctl stop docker
To start Docker again, simply launch the application from the Applications folder (Mac) or run the following command in a Terminal window (Linux):
sudo systemctl start docker
Docker: Windows
The best way to install Docker for Windows is by using Docker Desktop.
Docker Desktop is an easy-to-use application that allows you to run
containers on your Windows machine. It includes both Docker Engine and Docker
Compose, providing a complete development environment for containerized applications.
Warning: Docker is a resource-intensive application that may consume a
significant amount of disk space, memory, and CPU resources.
Installing and running Docker on your system may slow down your machine,
especially if it has limited resources. Make sure your system meets
the minimum requirements before
installing Docker, and consider monitoring resource usage to ensure optimal performance.
Follow these steps to install Docker Desktop for Windows.
Ensure your system meets the requirements:
- Windows 10 64-bit: Pro, Enterprise, or Education (Build 16299 or later) or Windows 11. Requirements change between Docker Desktop releases, so confirm the current list on the official Docker website.
- Virtualization must be enabled in the BIOS. You can usually find this setting under “CPU Configuration,” “Virtualization,” or “VT-x” settings.
Download Docker Desktop for Windows from the official Docker website (600+ MB).
Run the installer:
- Double-click on the downloaded Docker Desktop Installer.exe file to start the installation process.
- Follow the on-screen instructions, accepting the default settings or customizing them as needed.
Start Docker Desktop:
- After the installation is complete, Docker Desktop should start automatically.
- If it doesn’t, you can launch it from the Start menu.
- You will see the Docker icon in the system tray, indicating that Docker is running.
- Right-click on the icon and select “Dashboard” to open the Docker Desktop dashboard.
Verify the installation:
- Open a command prompt or PowerShell window.
- Run the following command to check the Docker version.
docker --version
- Run a test container to ensure that Docker is working correctly.
docker run hello-world
Save Resources
To stop Docker Desktop when you are not using it:
Locate the Docker icon in the system tray, which is typically located in the lower-right corner of the screen.
Right-click on the Docker icon to open the context menu.
Click on “Quit Docker Desktop” or “Exit” to stop Docker Desktop.
Git
Git is a popular tool used to help collaborate with others and keep track of code changes over time.
At a high level, Git is a version control system for tracking changes in evolving code projects.
Using Git allows you to easily revert to an earlier version of code if you make a mistake or if a change causes unexpected problems.
Git makes it easy to collaborate with others on code.
You can use Git to share your code with others,
track changes that they make, and
merge their changes back into your codebase.
This makes it a great tool for open source development,
where many people may be working on the same codebase at the same time.
In this Git introduction, we’ll start with the basics of using Git, including setting up your Git environment, creating a repository, and making commits. We’ll also cover more advanced topics like branching, merging, and collaborating with others.
Installation
The installation process for Git depends on your operating system. Follow the instructions below based on your platform:
Configuration
After installing, configure Git with your name and email address.
Using Git
When it comes to using Git, you have a few options for how to interact with it.
One option is to use Git in the terminal,
which involves typing out commands and working with the Git
command line interface.
Another option is to use a Git integration in your Integrated Development
Environment (IDE), such as Visual Studio Code (VS Code).
Using Git in the terminal can be a bit intimidating,
as it requires memorizing and typing out specific commands.
However, it can be a useful skill to have,
especially if you work on projects that require using Git outside of an IDE.
On the other hand,
using a Git integration in your IDE can make the process of working
with Git more user-friendly and intuitive,
as you can often perform Git actions with a few clicks or keystrokes.
For example, VS Code has built-in Git support and provides a
visual interface for common Git actions such as committing changes,
creating branches, and merging changes.
Git Crash Course (Video)
Check out the recommended Git Crash Course (Video).
Free ProGit (Book)
Check out the free ProGit book for a comprehensive guide to using Git.
See Also
Subsections of Git
Git: Installation
Git is a widely-used version control system that helps data analysts and developers track changes to their code and collaborate with others.
Mac/Linux Users
Option 1: Official installation instructions. Follow instructions on the official Git website. This is the most up-to-date and comprehensive guide to installing Git on your system.
Option 2: Step-by-step installation guide. Check out our installation instructions for a step-by-step guide.
Windows Users
Option 1: Official installation instructions. Follow instructions on the official Git website. This is the most up-to-date and comprehensive guide to installing Git on your system.
Option 2: Step-by-step installation guide. Check out our detailed installation instructions for a step-by-step guide.
Use Git to manage your code and collaborate with others.
Subsections of Git: Installation
Git: Mac/Linux
Task 1 - Download and install Git
- Open a terminal window
- Run the command for your system to install Git.
- For Debian/Ubuntu-based systems:
sudo apt-get install git
- For macOS (with Homebrew):
brew install git
Task 2 - Configure Git
- Open a terminal window
- Run the following commands to configure Git with your name (your real name, e.g. “Denise Case”) and the email address you used for GitHub.
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
- Important: Replace “Your Name” with your name and “your.email@example.com” with the email address associated with your GitHub account
- This configuration will be used for all of your Git repositories
Task 3 - Verify
- Run the following command to verify your Git configuration:
git config --list
- You should see your name and email address listed as user.name and user.email
- If the information is not correct, you can run the git config command again to update it
Git: Windows
Task 1 - Download and install Git
- Go to the Git download page at https://git-scm.com/download/win
- Click the “Download” button to download the Git installer
- Run the installer file that you downloaded
- Accept the default installation options, or choose the options you prefer for line ending conversion and terminal emulator when prompted
- Click “Install”
Task 2 - Configure Git
- Open a command prompt or PowerShell window
- Run the following commands to configure Git with your name (your real name, e.g. “Denise Case”) and the email address you used for GitHub:
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
- Important: Replace “Your Name” with your name and “your.email@example.com” with the email address associated with your GitHub account
- This configuration will be used for all of your Git repositories
Task 3 - Verify
- Run the following command to verify your Git configuration:
git config --list
- You should see your name and email address listed as user.name and user.email
- If the information is not correct, run the git config command again to update it
Git: Basics
Git is a widely-used version control system that helps you track changes to your code and collaborate with others.
With Git, you can create a complete history of your work, from the initial commit to the latest changes. This makes it easy to work on a project with others, keep track of your progress, and recover from mistakes.
Creating a Repository
To get started with Git, you need to create a repository.
This is where you’ll store your code and track changes to it.
There are several ways to create a repository:
Clone an existing repository: If you want to work on code that’s already been created and shared by someone else, you can clone their repository to your local machine. To do this, you’ll need the repository’s URL, and you can use the git clone command to create a local copy of the code.
Fork an existing repository: If you want to make changes to someone else’s code and contribute those changes back to their repository, you can fork their repository. This creates a copy of their repository in your GitHub account, which you can then clone to your local machine and work on.
Create a new repository: If you want to start a new project from scratch, you can create a new repository by clicking the “+” sign in the top right corner of your GitHub account.
Getting Code onto Your Machine
Once you have a repository set up, you’ll want to get the code onto your local machine so you can work on it.
To do this, clone the repository using the git clone sourceurl command. Change sourceurl to the address shown in the browser when viewing the root folder of the repository. This will create a local copy of the repository on your machine that you can work with.
Saving Changes with Git
Once you have a copy of the repository on your machine, you can make changes to the code and save those changes to the repository using Git. The basic workflow for this is:
Add changes: Use the git add . command to add the changes you’ve made to the staging area. It’s said “git add dot”. See the dot at the end? It means the current folder, so Git stages every new and changed file in that folder and its subfolders.
Commit changes: Use the git commit -m "add feature n" command to save the staged changes to the local repository with a descriptive commit message.
Push changes: Use the git push origin main command to push the changes from your local repository back up to the remote repository on GitHub.
This sequence of commands is very common:
git add .
git commit -m "tell us what you did"
git push origin main
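The sequence can be tried safely before using it on a real project. The sketch below is one way to do that, assuming Git is installed: a bare repository in a temporary folder stands in for the GitHub remote, and the folder names, file name, and commit message are placeholders.

```shell
# A bare repository in a temp folder stands in for the GitHub remote.
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"

# Clone it, as you would clone a GitHub repository.
# (Git warns that the repository is empty; that is expected here.)
git clone --quiet "$sandbox/remote.git" "$sandbox/work"
cd "$sandbox/work"

# Identity for this throwaway repository only (placeholder values).
git config user.name "Your Name"
git config user.email "your.email@example.com"

# Make a change, then stage, commit, and push it.
echo "# Demo project" > README.md
git add .
git commit --quiet -m "add README"
git branch -M main            # make sure the branch is named main
git push --quiet origin main

# The commit is now in the stand-in remote as well.
git --git-dir="$sandbox/remote.git" log --oneline main
```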
Editing on your Machine
We typically like to edit files on our machine using editors
like VS Code or IDEs like PyCharm and Spyder.
These local tools provide advanced features including syntax highlighting,
code completion, and debugging, which can make our work more efficient.
Editing in the Cloud
However, the power of our local editors and IDEs is increasingly
becoming available in the cloud, and we can make many updates to
our repositories right from the GitHub web interface.
For example, you can use the github.dev web-based editor to
edit files and commit your changes.
It’s important to note that if we edit files both on our machine
and in the cloud, we can end up with conflicts when trying to merge our changes.
Therefore, it’s important to ensure that we always pull down the latest
changes from the cloud before making any local edits, and that we
push our changes back up to the cloud as soon as we’re finished with them.
Using the git pull command will bring any changes made directly
in your GitHub (or other cloud) repository down to your machine.
Read more about the github.dev editor in the GitHub documentation.
In Git, “origin” is a shorthand name that refers to the remote repository where your code is stored.
When you clone a repository,
Git automatically creates an “origin” remote that points to the original repository on the server.
You can use this remote to pull changes from the server or push your local changes back to it.
You can add more than one remote to a repository.
Git branches are separate lines of development that allow multiple
contributors to work on different features or versions of a project simultaneously.
On GitHub, the default branch of a new repository is now called “main”, but “master” was previously used,
so you may still see it.
Pull and Push
When we use git pull with no arguments, Git already knows where to get the changes
(the remote repository and branch) and where to put them (the current local branch),
because git clone configured the local branch to track its remote counterpart.
When we use git push origin main, we name both the remote repository
(the destination) and the branch we want to push.
The origin in git push origin main refers to the remote repository we want
to push the changes to,
and main refers to the branch on the remote repository that we want to
update with our changes.
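How Git comes to "know" the remote and branch can be seen in a small experiment. The sketch below assumes Git is installed and uses a bare repository in a temporary folder as a stand-in for GitHub; the folder and file names are placeholders. It uses the -u option of git push, which records the remote branch as the upstream of the local branch.

```shell
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/remote.git"   # stand-in for the GitHub repo
git init --quiet "$sandbox/work"
cd "$sandbox/work"
git config user.name "Your Name"
git config user.email "your.email@example.com"
git remote add origin "$sandbox/remote.git"

echo "first line" > notes.txt
git add .
git commit --quiet -m "add notes"
git branch -M main

# First push: name the remote and the branch explicitly. The -u option
# records origin/main as the upstream (tracking) branch of local main.
git push --quiet -u origin main

# Show which remote branch the local main branch now tracks.
git rev-parse --abbrev-ref 'main@{upstream}'

# Later pushes and pulls on this branch need no arguments.
echo "second line" >> notes.txt
git commit --quiet -am "extend notes"
git push --quiet
git pull --quiet
```

A repository created with git clone already has this tracking information for its default branch, which is why a plain git pull works there right away.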
Git: Branches
In Git, we are always working on a branch of code,
which is like a separate “timeline” for the code.
Default Branch
The default branch is created automatically when we first create a repository,
and on GitHub it is named main by default. On older repos,
you may see a master branch instead, but the old terminology is
discouraged and the branch is easy to rename.
Working Alone
For independent projects, we may work directly on the main default branch.
Individual developers may choose to use branches to work on
new features or fixes without affecting their main codebase.
Working Together
In a professional environment, it’s generally recommended to create
new branches for new features or changes to avoid conflicts with other
developers and to make it easier to manage and review changes.
Individual developers can also use branches to experiment with new features
or make changes without affecting the main codebase.
We can make changes, commit them to our branch, and then merge our
branch back into the default branch when appropriate.
Multiple branches allow a team to work on different features or changes
at once without worrying about conflicts or breaking the main codebase.
Once we’re satisfied with our changes on a branch, we can create
a pull request to request that the changes be reviewed and
merged into the default branch. Team leads can then review and merge
the changes as needed. The default branch is typically set to “main”
and is the primary branch for the project.
You can create a new branch with the git branch command,
and switch to that branch with the git checkout command.
Once you’re on the new branch, any changes you make and commit
will only affect that branch.
To merge a branch back into the main codebase, you can use
the git merge command. This will bring any changes from the
branch into the main codebase,
and you can resolve any conflicts that arise during the merge.
Git branches are an important tool for managing complex
projects with multiple contributors,
and they allow for efficient collaboration and code review.
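The branch, checkout, and merge commands described above can be sketched end to end in a throwaway repository. This assumes Git is installed; the branch name add-greeting and the file names are hypothetical.

```shell
sandbox="$(mktemp -d)"
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git config user.name "Your Name"
git config user.email "your.email@example.com"

# Start with one commit on main.
echo "Project notes" > notes.txt
git add .
git commit --quiet -m "initial commit"
git branch -M main

# Create a branch and switch to it.
git branch add-greeting
git checkout --quiet add-greeting

# Commits made here affect only add-greeting.
echo "Hello from the branch" > greeting.txt
git add .
git commit --quiet -m "add greeting"

# Switch back to main; greeting.txt is not there yet.
git checkout --quiet main
ls

# Merge the branch into main, then delete the finished branch.
git merge --quiet add-greeting
git branch -d add-greeting
ls
```

In Git 2.23 and later, git switch -c add-greeting creates the branch and switches to it in one step; the two-command form is used here because it matches the commands named in the text.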
Git: Configuration
After installing, configure Git with your name and email.
Use your GitHub email for best results.
Open Git Bash on Windows
To open Git Bash on Windows:
- Press the Windows key on your keyboard to open the Start menu.
- Type “Git Bash” into the search bar and select it from the list of results.
- Git Bash should now open in a new window.
Open Terminal on Mac or Linux
On Mac or Linux, open the Terminal app.
Check Git Configuration
Type the following command to display your Git configuration:
git config --list
Look for the following lines in the output:
user.name=Your Name
user.email=your.email@example.com
If you see your name and email listed, then they are set in Git.
Set Git Configuration
If you don’t see your name and email listed,
set them using the following commands:
git config --global user.name "Your Name"
git config --global user.email your.email@example.com
Replace “Your Name” and “your.email@example.com” with
your actual name and
email address.
The --global flag ensures the settings are applied globally
across all your Git repositories.
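For contrast, the sketch below shows what happens when --global is left out: the setting is stored in a single repository and applies only there. It assumes Git is installed and works in a throwaway repository, so your real global settings are not changed; the name and email values are placeholders.

```shell
sandbox="$(mktemp -d)"
git init --quiet "$sandbox/demo"
cd "$sandbox/demo"

# Without --global, the setting is stored in this repository's .git/config
# and overrides any global value while you work in this repository.
git config user.name "Project Name"
git config user.email "project.email@example.com"

# Read back single values instead of scanning the full list.
git config user.name
git config user.email

# Show only the settings stored in this repository.
git config --local --list
```

A repository-level setting like this can be useful when one project needs a different email address from the rest of your work.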
Git: Conflicts
We can edit project files in at least two places:
- locally, on our machine
- in the cloud, e.g., by using the editing features in GitHub
Bad Practices
We want to keep our local version and cloud version in sync at all times.
Some of the worst things we can do are:
- Forget to pull before we start our work.
- Pull code and leave it for a long time, then start working on old, stale code.
- Make huge, expansive contributions that take a long time (unless we know how to branch - an intermediate Git skill.)
- Wait to push our completed changes to the cloud.
Good Practices
To minimize the chance of conflicts:
- Always pull code before you start working locally. Never work on stale code!
- Make small, incremental changes.
- As soon as you finish a useful contribution, git add, commit, and push up to the cloud.
Keep your local and cloud repositories synchronized. Use these for each session.
Before you start:
git pull
After you finish a set of edits:
git add .
git commit -m "add title"
git push
When working collaboratively, communicate with team members and establish a clear workflow.
Ensure the team knows who is working on which files and when changes are being made.
You might create different small,
focused branches that don’t overlap much in terms of the files they modify.
Merge Conflicts
Merge conflicts can occur when:
- two people edit the same file simultaneously
- changes are made to a file both locally and in the cloud at the same time.
- two branches with different changes are merged.
For example, we might use the GitHub cloud editor to make a quick
fix to our README.md - forgetting that we’re also in the process of updating installation instructions on the local README.md.
Merge conflicts can be frustrating,
but they are an inevitable part of collaborative work.
If you do run into a merge conflict,
don’t worry - it’s not the end of
the world.
Git provides tools to help you resolve conflicts and merge
changes together.
The first step is to understand which files have
conflicts by running git status.
The files with conflicts will be
marked as “unmerged”.
To resolve the conflict,
open the conflicted file and look for the conflicting sections
marked with <<<<<<< HEAD, =======, and >>>>>>>.
Manually edit the file to remove the conflicting sections
and keep the changes you want.
Once you’ve resolved the conflict,
stage the changes with git add and commit them with git commit.
If you’re still unsure how to resolve the conflict,
ask for help from your team members or consult Git documentation.
Stay calm and take your time to carefully resolve the conflict.
Experience managing merge conflicts can be very valuable.
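One low-risk way to gain that experience is to cause a conflict on purpose in a throwaway repository. The sketch below assumes Git is installed; the branch name cloud-edit and the README wording are hypothetical, with the branch standing in for an edit made in the cloud.

```shell
sandbox="$(mktemp -d)"
git init --quiet "$sandbox/project"
cd "$sandbox/project"
git config user.name "Your Name"
git config user.email "your.email@example.com"

echo "Install with pip." > README.md
git add .
git commit --quiet -m "add README"
git branch -M main

# One change is made on a branch...
git checkout --quiet -b cloud-edit
echo "Install with conda." > README.md
git commit --quiet -am "recommend conda"

# ...and a different change to the same line is made on main.
git checkout --quiet main
echo "Install with pip or pipx." > README.md
git commit --quiet -am "mention pipx"

# The merge stops with a conflict and exits with a non-zero status.
git merge cloud-edit || echo "merge stopped: conflict in README.md"

git status --short      # README.md is listed as unmerged
cat README.md           # shows the <<<<<<<, =======, and >>>>>>> markers

# Resolve by writing the version to keep, then stage and commit.
echo "Install with pip, pipx, or conda." > README.md
git add README.md
git commit --quiet -m "merge cloud-edit and combine install notes"
```

Here the resolution replaces the whole file for brevity; in a real project you would usually open the file in an editor and remove only the marker lines and the text you do not want.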
Git: Crash Course
Student-recommended video on Git - definitely worth sharing!
It covers things in a similar way and you can jump right to the parts you need.
Note: Watch when you have time or use it when you’re ready to learn more about Git.
Many students find it very helpful.
I don’t know how anyone could provide more information, more efficiently than this.
https://www.youtube.com/watch?v=RGOj5yH7evk
Git and GitHub for Beginners - Crash Course
Over 2 million views.
From the video description:
Learn about Git and GitHub in this tutorial.
These are important tools for all developers to understand.
Git and GitHub make it easier to manage different software versions and make it easier for multiple people to work on the same software project.
This course was developed by Gwen Faraday.
⭐️ Contents ⭐️
⌨️ (0:00) Introduction
⌨️ (1:10) What is git?
⌨️ (1:30) What is version control?
⌨️ (2:10) Terms to be learn in video
⌨️ (5:20) Git commands
⌨️ (7:05) sign up in GitHub
⌨️ (11:32) using git in local machine
⌨️ (11:54) git install
⌨️ (12:48) getting code editor
⌨️ (13:30) inside VS Code
⌨️ (14:30) cloning through VS Code
⌨️ (17:30) git commit command
⌨️ (18:15) git add command
⌨️ (19:15) committing
⌨️ (20:20) git push command
⌨️ (20:30) SSH Keys
⌨️ (25:25) git push
⌨️ (30:21) Review workflow so far
⌨️ (31:40) Compare between GitHub workflow and local git workflow
⌨️ (32:42) git branching
⌨️ (56:30) Undoing in git
⌨️ (1:01:50) Forking in git
⌨️ (1:07:55) Ending
Git: Remotes
In Git, the term “origin” refers to the default remote repository that a local repository is linked to.
When you clone a repository from a remote server to your local machine,
Git automatically sets up the “origin” remote for you, pointing to the repository you cloned from.
This allows you to push changes from your local repository to the remote repository,
and pull changes from the remote repository to your local repository.
When you push a branch to origin, your commits are added to
the branch of the same name on the remote repository.
Using the “origin” remote allows you to collaborate with others by sharing changes to the same repository.
When someone else pushes changes to the remote repository,
you can pull those changes down to your local repository
and merge them with your own changes.
However, if you edit the same file in
both your local repository and the remote repository,
conflicts can arise.
To avoid conflicts,
it’s important to always pull down changes
from the remote repository
before making your own changes,
and to carefully review any merge conflicts that arise.
Working with Remote Repositories
Git provides a set of commands that allow you to work with remote repositories. Here are some commonly used commands:
git remote
- List the remote repositories that are connected to your local repository.
git remote -v
- List the remote repositories along with their URLs.
git remote add <name> <url>
- Add a new remote repository to your local repository. The <name> parameter is the name you want to give the remote, and <url> is the URL of the remote repository.
git remote rm <name>
- Remove a remote repository from your local repository.
git push <remote> <branch>
- Push your local commits to a remote repository. The <remote> parameter is the name of the remote (for example, origin), and <branch> is the branch you want to push.
git pull <remote> <branch>
- Pull changes from a remote repository into your local repository. The <remote> parameter is the name of the remote repository, and <branch> is the branch you want to pull from.
git fetch <remote>
- Download the latest changes from a remote repository, but don’t merge them into your local branches.
git clone <url>
- Clone a remote repository to your local machine.
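To make the output of git remote -v more concrete, here is a minimal Python sketch that turns that output into a dictionary. The sample text and the URL in it are made up for illustration (they are not taken from a real repository), and parse_remotes is a hypothetical helper name, not part of Git.

```python
from __future__ import annotations


def parse_remotes(output: str) -> dict[str, dict[str, str]]:
    """Parse the text printed by `git remote -v` into {name: {kind: url}}."""
    remotes: dict[str, dict[str, str]] = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, url, kind = parts
        # kind looks like "(fetch)" or "(push)"; drop the parentheses
        remotes.setdefault(name, {})[kind.strip("()")] = url
    return remotes


# Illustrative output only; the URL below is a made-up example.
SAMPLE = (
    "origin\thttps://github.com/example-user/example-repo.git (fetch)\n"
    "origin\thttps://github.com/example-user/example-repo.git (push)\n"
)

remotes = parse_remotes(SAMPLE)
print(remotes["origin"]["fetch"])
```

Each remote normally appears twice, once for fetching and once for pushing, which is why the sketch stores a URL per kind.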
Git Learning: Concepts Over Memorization
Learning every Git command by heart is neither necessary nor efficient.
Instead, focus on understanding the concepts and workflows of Git,
and how the commands fit into those workflows.
The many online resources available will serve as
reliable references when you need them.
As you work with Git more frequently,
the most common commands will become second nature.
However, for the rest, don’t hesitate to look them up.
Remember that the value of Git lies not in memorizing commands
but in leveraging its powerful version control capabilities
to manage your projects effectively.
GitHub
GitHub is a popular, web-based platform that allows data analysts and developers
to store and manage their code and collaborate with others.
GitHub is built on Git,
which is a distributed version control system that allows developers to track changes
to their code over time and collaborate with others on the same codebase.
With GitHub, developers can create their own repositories,
which are essentially folders that contain their code, documentation,
and other files related to a specific project. They can also fork other people’s
repositories to create their own copies, which they can then modify and
contribute back to the original repository.
This allows for easy collaboration and code sharing among developers.
GitHub provides tools for developers to manage their code,
such as the ability to track and resolve issues,
review and merge pull requests, and create and manage branches.
It also provides a web-based interface for viewing code,
including a built-in editor for making changes in the browser.
Additionally, it has a wide range of integrations and APIs that allow developers
to automate various development tasks and integrate with other tools and services.
Sign Up For A Free Account
Sign up for a free account with GitHub.com,
a code hosting platform that manages a vast number of programming projects.
Follow their website instructions to get started.
See the recommendations on GitHub email and username below.
GitHub Email
You’ll need an email.
I use a permanent personal email for most GitHub work,
rather than a work or school account (which may be temporary).
You can keep your email address private in your GitHub email settings.
GitHub Username
You’ll create a GitHub username.
Your username will be public.
Your username can be anonymous (e.g., ‘analystextraordinaire’)
or publicly associated with you.
For example, I use ‘denisecase’.
Your username will be a part of the URL to all of your projects.
Students New to GitHub
- Recruiters may look at GitHub and LinkedIn profiles - it can be helpful to show your skills using modern tools.
- Be courageous. The best way to learn is by doing, and don’t be too concerned about making mistakes.
- Git mistakes and do-overs are common when getting started.
- Learning to fix issues is a key skill in data analytics.
- Keep and share your latest, most useful, and best work in GitHub.
GitHub Repositories
Each coding project lives in a GitHub repository (called ‘repo’ for short) in the ‘cloud’ (a distributed group of machines).
Git (the system) keeps track of committed changes to an evolving project.
- The GitHub repo can be kept in sync with a git repo on your local machine.
- For example
- If a GitHub repo is named datafun-01-getting-started
- On my machine, it’s in my Documents/datafun-01-getting-started directory
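The way a username and a repo name combine into a repository URL can be sketched in a few lines of Python. This is only an illustration: split_repo_url is a hypothetical helper, and the account name 'analystextraordinaire' is the made-up example from earlier, not a real owner of this repo.

```python
from __future__ import annotations

from urllib.parse import urlparse


def split_repo_url(url: str) -> tuple[str, str]:
    """Return (username, repo_name) from a https://github.com/<user>/<repo> URL."""
    path_parts = [part for part in urlparse(url).path.split("/") if part]
    if len(path_parts) < 2:
        raise ValueError(f"Not a repository URL: {url}")
    username, repo = path_parts[0], path_parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]  # clone URLs often end in .git
    return username, repo


# Hypothetical account name, used only for illustration.
username, repo = split_repo_url(
    "https://github.com/analystextraordinaire/datafun-01-getting-started"
)
print(username)  # analystextraordinaire
print(repo)      # datafun-01-getting-started
```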
Quick Quiz
Go to: https://github.com/denisecase/datafun-01-getting-started
Q: What is the username?
Q: What is the repo name in the URL?
Get Started
After you have an account, you can use the Get Started Guide
that the GitHub team has created to help you understand the platform.
For more information on getting started on GitHub,
view the “Getting Started with GitHub” video below from the GitHub Training & Guides YouTube channel.
More About GitHub
The following definition of GitHub comes from Kinsta.com:
At a high level, GitHub is a website and cloud-based service that helps developers store and manage their code, as well as track and control changes to their code. To understand exactly what GitHub is, you need to know two connected principles: Version control, which helps developers track and manage changes to a software project’s code, and Git, which is a specific open-source version control system.
Learn more about GitHub in the following video from the GitHub YouTube channel.
Free Stuff For Students
For more fun stuff, check these out.
See Also
There is more information about GitHub in the Hosting Chapter.
Homebrew
Homebrew is a package manager for macOS and Linux that simplifies
the installation, updating, and management of software on your system.
Homebrew allows you to install various command-line tools, applications,
and libraries with ease. It is designed to work seamlessly with macOS and
Linux, providing a user-friendly interface for managing software packages.
Jupyter
Jupyter is an open-source web application that allows users to
create and share documents that contain live code, equations,
visualizations, and narrative text.
It is a popular tool for data analysis, scientific computing,
and machine learning, and is widely used in academic research,
industry, and data science education.
Jupyter gets its name from Julia, Python, and R, three of the core programming languages it was originally designed to support.
Jupyter provides the following features.
Interactive Computing
Jupyter notebooks allow users to write and execute code interactively, providing
an interactive computing environment. This allows users to explore data, prototype
algorithms, and create visualizations in a single, cohesive environment.
Multiple Language Support
Jupyter supports multiple programming languages, including Python, R, and Julia.
This makes it easy to integrate different tools and frameworks and collaborate
with colleagues who use different programming languages.
Collaboration
Jupyter notebooks can be shared with others, allowing for easy collaboration and
reproducibility of analyses. This also facilitates communication and knowledge sharing
among team members and stakeholders.
Visualization
Jupyter notebooks support interactive visualization libraries such as Matplotlib,
Bokeh, and Plotly, making it easy to create and share data visualizations.
Integration
Jupyter notebooks can be integrated with other tools and frameworks such as
Git, GitHub, and Docker. This makes it easy to manage version control, share
code and data, and deploy projects.
Ecosystem
Jupyter has a rich ecosystem of tools and services, such as
JupyterLab, JupyterHub, and Binder,
that can help streamline the development and deployment process.
Many third-party tools and plugins also integrate with Jupyter to
extend its functionality.
Jupyter Installation
The installation process for Jupyter depends on your operating system and your
preferred installation method.
Follow the instructions below based on your platform.
Jupyter Ecosystem
Here’s a short guide to clarify some of the terms used with Jupyter.
JupyterLab: An interactive development environment (IDE) for working with Jupyter notebooks, code, and data. It provides a flexible and powerful user interface that can be customized to suit the needs of individual users.
Jupyter Notebook: A web-based interactive computational environment for creating and sharing notebooks, which are documents that contain live code, equations, visualizations, and narrative text.
JupyterHub: A multi-user server that allows multiple users to access Jupyter notebooks and other resources from a shared server. It is commonly used in educational settings or for collaborative research projects.
Jupyter Book: A tool for building beautiful, publication-quality books and documents from computational material, such as Jupyter notebooks. It provides a simple way to create interactive documents with executable code and visualizations.
nbconvert: A command-line tool that converts Jupyter notebooks to other formats, such as HTML, PDF, or Markdown. This allows you to share your work with others who may not have Jupyter installed.
ipywidgets: A library for creating interactive widgets in Jupyter notebooks. Widgets are user interface elements, such as buttons and sliders, that allow you to interact with and visualize data in real time.
nbviewer: A web application that allows you to view Jupyter notebooks without having to install Jupyter yourself. You can simply paste the URL of a notebook and view it in your browser.
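A notebook file (.ipynb) is stored as JSON, so its mix of code and narrative text can be inspected with ordinary tools. The sketch below builds a tiny notebook by hand using only Python's standard json module and counts its cells by type. The cell contents are made up for illustration, and real notebooks usually carry more metadata than is shown here.

```python
import json
from collections import Counter

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Sales analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["total = 2 + 3\n", "print(total)"],
        },
    ],
}

text = json.dumps(notebook, indent=1)  # the text that would be saved in a .ipynb file
loaded = json.loads(text)              # what a tool sees when it reads the file back

cell_counts = Counter(cell["cell_type"] for cell in loaded["cells"])
print(dict(cell_counts))  # {'markdown': 1, 'code': 1}
```

Markdown cells hold the narrative text and code cells hold the live code, along with any outputs saved from the last run.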
Get Started with Jupyter Notebooks
There are excellent resources available for getting started with Jupyter Notebooks.
See:
VS Code
Visual Studio Code (VS Code) is a free code editor developed by Microsoft and built on an open-source codebase.
It is available on Windows, Linux, and macOS and offers features such as debugging, syntax highlighting, and intelligent code completion.
Some of the key features of VS Code include:
- Built-in Git integration
- Support for multiple languages and frameworks
- Extensions for customizing the editor and adding new functionality
- Debugging capabilities for Node.js, Python, and other languages
- Integrated terminal for running commands and scripts
Using a modern editor or IDE can make your coding experience more efficient and productive.
Installation
The installation process depends on your operating system. Follow the instructions below based on your platform:
VS Code Extensions
VS Code extensions are add-ons that allow users to customize and
enhance the functionality of VS Code.
For example, IntelliSense provides intelligent code suggestions,
auto-completion, and parameter hints while writing code.
IntelliSense itself is a built-in feature that is enabled by default;
language extensions, such as the Python extension, add richer IntelliSense for their languages.
To learn more about extensions, visit the official documentation at
https://code.visualstudio.com/docs/introvideos/extend.
Why VS Code
One reason we teach VS Code over other IDEs (e.g., Spyder, PyCharm, IDLE)
is that VS Code is a more general-purpose code editor
that supports multiple languages and workflows, and works on Windows,
Mac, and Linux machines.
VS Code is capable of handling a wide range of tasks and can be used for
web development, data analysis, scripting, and more.
With its built-in features and extensions, VS Code also works well with other languages
including Markdown, SQL, PowerShell, Julia, and more.
Learning VS Code is a great skill for someone who is
getting started with programming, data analysis, and/or automation
and wants to learn a versatile environment that will accommodate growing skills.
VS Code is widely used and well-supported,
with many resources for learning how to use it effectively.
In addition to the comprehensive official documentation,
there are articles and videos available for beginners through experts.
Subsections of VS Code
VS Code: Installation
PowerShell is a powerful command-line shell and scripting language designed for system administration and automation tasks. Here are some options for installing PowerShell on your system:
Windows Users
Option 1: Install via Microsoft Store. If you’re running Windows 10 or later, you can install PowerShell via the Microsoft Store. This is the recommended method, as it ensures that you have the latest version of PowerShell and allows for easy updates.
Option 2: Download the MSI installer. If you’re not able to install via the Microsoft Store, you can download the MSI installer from the PowerShell GitHub repository. Choose the appropriate version for your system architecture (32-bit or 64-bit) and follow the installation wizard.
macOS Users
Option 1: Install via Homebrew. If you’re using Homebrew on your Mac, you can install PowerShell by running the following command in your terminal: brew install --cask powershell.
Option 2: Download the PKG installer. You can also download the PKG installer from the PowerShell GitHub repository. Choose the appropriate version for your macOS version and system architecture (Intel or Apple Silicon) and follow the installation wizard.
Linux Users
Option 1: Package manager installation. Microsoft publishes PowerShell packages for many Linux distributions through its own package repository. After registering that repository, you can install PowerShell with your package manager. For example, on Ubuntu or Debian, you can run sudo apt-get install powershell.
Option 2: Download the package manually. You can also download the package for your distribution directly from the PowerShell GitHub repository and install it manually. Follow the instructions for your specific distribution on the download page.
Once you have PowerShell installed, you can use it to automate a wide range of system administration tasks. Happy scripting!
Winget
Winget (Windows Package Manager) is an official package manager for
Windows systems, developed by Microsoft.
It simplifies the process of discovering, installing, upgrading, and removing software on Windows machines.
Winget provides command-line access to manage software packages,
similar to package managers on Linux and macOS systems.
With Winget, you can search for, install, update, and uninstall
software packages without having to manually navigate to a website,
download installers, or follow installation wizards.
Winget automates these tasks and makes it easy to manage software on your Windows system.
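The four actions described above map onto four winget subcommands. The Python sketch below only assembles and prints the command lines, so it is safe to run on any operating system; to use them, type the printed commands into a Windows terminal. The package ID and search term are examples chosen for illustration, so confirm the exact ID with winget search before installing.

```python
# Example package ID; check the exact ID with `winget search` first.
PACKAGE_ID = "Microsoft.VisualStudioCode"

commands = {
    "search": ["winget", "search", "vscode"],
    "install": ["winget", "install", "--id", PACKAGE_ID, "--exact"],
    "upgrade": ["winget", "upgrade", "--id", PACKAGE_ID],
    "uninstall": ["winget", "uninstall", "--id", PACKAGE_ID],
}

for action, args in commands.items():
    # Print rather than run, so nothing is changed on this machine.
    print(f"{action:>9}: {' '.join(args)}")
```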
Alternatives
Chocolatey remains a popular alternative.
Chocolatey has been around for a longer time,
offering a mature set of features and a large repository of packages.
The Chocolatey community is well-established, and it has extensive
documentation and support.
Chocolatey is known for its versatility and integration
with various Windows tools, such as PowerShell and the NuGet packaging infrastructure.
This makes it a popular choice for many Windows users looking for a
reliable and comprehensive package management solution.