---
title: "Git Basics <br><small>Advanced Data Analytics</small></br>"
author: "BIOL 5660"
output:
  html_notebook:
    df_print: paged
    rows.print: 10
    theme: cosmo
    highlight: breezedark
    number_sections: yes
    toc: yes
    toc_float:
      collapsed: no
      smooth_scroll: yes
  pdf_document: default
  html_document:
    toc: yes
    df_print: paged
editor_options:
  chunk_output_type: inline
  mode: gfm
---
<style type="text/css">
h1.title {
font-size: 40px;
font-family: "Times New Roman", Times, serif;
color: DarkBlue;
text-align: center;
}
h4.author { /* Header 4 - and the author and data headers use this too */
font-size: 20px;
font-family: "Times New Roman", Times, serif;
color: DarkBlue;
text-align: center;
}
</style>
# What is Git?
<a href="https://git-scm.com/" target="_blank">Git</a> is a distributed version control system that is often used in conjunction with, but is not the same as, <a href="https://www.github.com/" target="_blank">GitHub</a>, a web-based hosting service for Git repositories that is used for code sharing and publishing. Essentially, **Git** keeps track of changes to the files in a project folder as a series of snapshots, while **GitHub** provides a centralized location to store and share those snapshots. We will talk more about GitHub in our next exercise. For now we will focus on some of the primary commands for version control using Git.
## Getting Started
**Git** is open-source software that can be installed on Windows, Mac, or Linux computers. Fortunately, **Git** has already been installed on the lab computers, but if you are using your personal laptop take this time to follow the <a href="https://git-scm.com/downloads" target="_blank">link</a> to the proper version of the software, download it, and install it on your computer. Once installed, navigate to the *Git Bash* terminal or other terminal application on your computer. On *Windows* you can access this via the <u>Start Menu</u>:
<p align="center">
![*windows start menu highlighting the git bash terminal*](C:/Users/gentryc/Google Drive/APSU/Courses/BIOL 5700 Advanced Data Analytics\Exercise_7/GitBashPC.jpg "Windows Start Menu")
</p>
*Git Bash* is a command line terminal with Bash (Bourne Again Shell) emulation for PC. Its appearance is similar to other terminals you may have used previously. On a *Mac* you can use Spotlight to locate the Terminal application or, to avoid directory issues, navigate to the appropriate folder and select *"New Terminal at Folder"* from the menu.
<p align="center">
![*git bash terminal*](C:/Users/gentryc/Google Drive/APSU/Courses/BIOL 5700 Advanced Data Analytics/Exercise_7/GitBashTerminal.jpg "Git Bash Terminal")
</p>
One of the first steps in setting up **Git** is identifying who is responsible for the changes to the content being edited. To set this up, use the following two lines of code:
```
git config --global user.name "INSERT YOUR NAME HERE"
```
and
```
git config --global user.email "INSERT YOUR EMAIL HERE"
```
You can use ```git config --list``` if you need to review the information you provided with the other *git config* commands.
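To see where these settings end up without touching your real configuration, the sketch below sets an identity at the repository level only (no `--global`) inside a throwaway folder. The name, email address, and folder are placeholders chosen for illustration.

```shell
# Create a scratch repository so nothing outside it is modified
config_demo="$(mktemp -d)"
cd "$config_demo"
git init --quiet

# Without --global, the setting is stored only in this repository (.git/config)
git config user.name "Example Student"
git config user.email "student@example.com"

# Read single values back; repository-level values take precedence over global ones
git config --get user.name
git config --get user.email
```

A repository-level identity like this can be handy on a shared lab computer, where the global setting may belong to someone else.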
If you plan to use **Git** in combination with a code sharing or publishing service like:<br></br>
<br></br>
<p align="center">
<a href="https://bitbucket.org/product"><img src="C:\\Users\\gentryc\\Google Drive\\APSU\\Courses\\BIOL 5700 Advanced Data Analytics\\Exercise_7\\BitBucket.png" style="width:150px; height:150px" title="BitBucket" alt="BitBucket"></a> <a href="https://www.github.com/"><img src="C:\\Users\\gentryc\\Google Drive\\APSU\\Courses\\BIOL 5700 Advanced Data Analytics\\Exercise_7\\GitHub.png" style="width:150px; height:150px" title="GitHub" alt="GitHub"></a> <a href="https://www.gitkraken.com/"><img src="C:\\Users\\gentryc\\Google Drive\\APSU\\Courses\\BIOL 5700 Advanced Data Analytics\\Exercise_7\\GitKraken.png" style="width:176px; height:150px" title="GitKraken" alt="GitKraken"></a> <a href="https://about.gitlab.com/"><img src="C:\\Users\\gentryc\\Google Drive\\APSU\\Courses\\BIOL 5700 Advanced Data Analytics\\Exercise_7\\GitLab.png" style="width:164px; height:150px" title="GitLab" alt="GitLab"></a> <a href="https://sourceforge.net/"><img src="C:\\Users\\gentryc\\Google Drive\\APSU\\Courses\\BIOL 5700 Advanced Data Analytics\\Exercise_7\\SourceForge.png" style="width:150px; height:150px" title="SourceForge" alt="SourceForge"></a>
</p>
<br></br>
you may be able to interface with programs like <a href="https://www.rstudio.com/" target="_blank">RStudio</a> or <a href="https://www.anaconda.com/distribution/" target="_blank">Anaconda</a> for version control. In the next exercise we will use RStudio for version control and continue using that same method for the remainder of the course.
### Git Commands
Although we can [git](https://www.merriam-webster.com/dictionary/pun) by with using a program like RStudio because of its built-in version control, you can also use Git directly on your machine to create a local *git repository*. A repository is like a folder for your project: it contains all of your project files and stores the revision history for each file. Repositories can be owned by one person or shared with others using that same computer or network folder. In the case of a cloud-based repository, that information can be shared by individuals on different networks. For now, we will be working locally to create a git repository.
To start, please create a folder in your student folder in the <u>Z: Drive</u> titled **7-Git**.
<br></br>
> Tip: various applications dislike the utilization of spaces in your directory path. It's good to get in the habit of replacing them with underscores or dashes.
<br></br>
Now we need to use **Git Bash** or the terminal window to issue some commands. The first command changes the working directory to the folder that will hold your git repository.
```
# In Git Bash (quote the path because it contains a space)
cd "z:/Students/YOUR NAME/7-Git"
REM Or in the Windows terminal window (Command Prompt)
Z:
cd "\Students\YOUR NAME\7-Git"
```
This places the terminal in the folder where the repository will be created. Initializing the folder tells **Git** to start tracking it for changes. This can be done using the following command:
```
git init
```
The terminal window should respond with "Initialized empty Git repository in Z:/Students/YOUR NAME/7-Git/.git/" and show the directory as /z/Students/YOUR NAME/7-Git (master). Depending on how Git is configured, the branch may be named *main* instead of *master*.
If you navigate to your student folder on the Z: Drive you should now see a new folder named *.git* (it is a hidden folder, so you may need to show hidden items). Within that folder there should be a number of other folders and files. If you want to see the current state of the repository you can type:
```
git status
```
Git will then report the current branch (On branch master), whether there are any commits (No commits yet), and that there is "nothing to commit (create/copy files and use "git add" to track)" within the project folder. Now it is time to begin adding files and/or folders to be tracked by **Git**. To do this you can simply drag and drop files from a previous exercise or from a new project. So let's start by copying the files from your *1-Markdown* exercise (data, scripts, and html) into the new **7-Git** folder you created above. You can use the ```ls``` command in bash or ```dir``` in the terminal window to see those files in the new folder.
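The init and status steps can also be tried safely in a scratch folder before working on the Z: Drive. A minimal sketch, assuming a temporary folder and a hypothetical file named `notes.txt`:

```shell
# Work in a throwaway folder instead of your course folder
init_demo="$(mktemp -d)"
cd "$init_demo"

# Turn the folder into a repository; this creates the hidden .git folder
git init --quiet
ls -a

# A brand-new file is untracked until it is added
echo "first draft" > notes.txt

# --short prints one line per file; "??" marks an untracked file
git status --short
```

The `--short` flag is optional; plain `git status` gives the same information in a longer, more descriptive form.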
Once you have files added to the folder you can choose to track the changes on an individual file using:
```
git add <file name>
```
Alternatively, you can stage every file in the folder using the following command:
```
git add .
```
For this example, go ahead and use the command to add all of the added files and folders. Once you have added these files, run the ```git status``` command again. This should list the files that have been marked for inclusion in the next commit. Marking files this way is called *staging*. You will see this terminology when we begin to use **Git** and **GitHub** with RStudio.
If you want to remove a file from the staging area once it has been staged (the file itself stays in your folder), you can use the following command:
```
git rm --cached <file name>
```
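The difference between staging and unstaging is easiest to see side by side. The sketch below uses a scratch folder and two hypothetical files, `analysis.R` and `scratch_notes.txt`, to stage everything and then take one file back out.

```shell
# Scratch repository with two hypothetical files
unstage_demo="$(mktemp -d)"
cd "$unstage_demo"
git init --quiet
echo "keep me" > analysis.R
echo "oops"    > scratch_notes.txt

# Stage everything, then take one file back out of the staging area
git add .
git rm --cached --quiet scratch_notes.txt

# "A " marks a file staged for the next commit, "??" an untracked file
git status --short

# The unstaged file is still on disk; only its staging entry was removed
ls
```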
To record the staged files in the repository, you must *commit* them (sort of the **Git** term for save). To do that, you must also add a commit message. This message should be concise and describe the changes or additions that are being made. For example, you could say "adding rmd" or "updated r script" or something similar to that. Use the following command to commit the changes and add your own commit message.
```
git commit -m "Place the required commit message here"
```
When we begin to utilize **GitHub** for version control we will use the *push* command to send committed changes to a code sharing site. For this example we will not need the *push* command, but it looks like this:
```
git push
```
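Although *push* is not needed for this exercise, the mechanics can be previewed without any online account by using a second local folder as the remote. This is only a sketch: the bare repository stands in for a hosting site, and the remote name `origin`, the identity, and all paths are assumptions for illustration.

```shell
# A bare repository (history only, no working files) plays the role of the hosting site
remote_dir="$(mktemp -d)"
git init --quiet --bare "$remote_dir"

# An ordinary working repository with one commit
work_dir="$(mktemp -d)"
cd "$work_dir"
git init --quiet
git config user.name "Example Student"
git config user.email "student@example.com"
echo "hello" > test.txt
git add test.txt
git commit --quiet -m "add test file"

# Register the remote under the conventional name "origin" and push the current branch
git remote add origin "$remote_dir"
git push --quiet -u origin HEAD

# List the branches and commit hashes the remote now holds
git ls-remote origin
```

With a hosting service, the only part that changes is the address given to `git remote add`.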
To test your understanding of the process above, create a new *text file* in the directory called test.txt and try repeating the steps above using the ```git add test.txt``` command. Then commit the file using an appropriate commit message. When finished use the following command:
```
git log
```
What does this tell you?
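To compare against your own output, the sketch below runs the full add, commit, and log cycle twice in a scratch folder. The identity and commit messages are placeholders, and `--oneline` is just a compact display option for `git log`.

```shell
# Scratch repository with a local identity (placeholder values)
log_demo="$(mktemp -d)"
cd "$log_demo"
git init --quiet
git config user.name "Example Student"
git config user.email "student@example.com"

# First snapshot
echo "hello" > test.txt
git add test.txt
git commit --quiet -m "add test file"

# Second snapshot after an edit
echo "a second line" >> test.txt
git add test.txt
git commit --quiet -m "update test file"

# One line per commit, newest first: abbreviated hash plus commit message
git log --oneline
```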
### Git Ignore
If you don't want to commit all of the files in a given project you can create a **.gitignore** file. This is a simple text file containing rules that tell **Git** which files or folders to leave untracked. This file should be <u>staged and committed</u> to the repository so the same rules apply for all users of the project.
To create the *.gitignore* file, type the following command:
```
# In Git Bash
touch .gitignore
REM Or in the Windows terminal window (Command Prompt)
type nul > .gitignore
```
Once you have created the file, navigate to your student drive and right click on the file to open it with a text editor such as Code Writer, NotePad, [TextPad](https://www.textpad.com), [Visual Studio Code](https://code.visualstudio.com/), WordPad, etc. The comment syntax used in programs like **R** also works here: a line beginning with **#** is treated as a comment, and blank lines match nothing, so they can be used as separators for legibility.
In the *.gitignore* file, I generally like to begin with a comment regarding what needs to be ignored. The following is a simple example:
```
#Ignore these folders
images/
examples/
#Ignore these file types
*.csv
*.txt
*.docx
*.xlsx
#Ignore these files
test_nb.Rmd
.Rhistory
```
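Since the *.gitignore* file is stored in your repository, you don't need to provide the entire path to specific files or folders. **Git** applies a pattern such as `*.xlsx` or `images/` to matching files and folders at any level of the repository. Note that these rules only affect files that are not yet tracked; a file that has already been committed will continue to be tracked until it is removed with `git rm --cached`. Be sure to add and commit the *.gitignore* file each time it is updated.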
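One way to confirm that the rules behave as intended is `git check-ignore`, which reports the rule responsible for ignoring a given path. A minimal sketch, assuming a scratch folder and hypothetical file names:

```shell
# Scratch repository with a small .gitignore (file names are hypothetical)
ignore_demo="$(mktemp -d)"
cd "$ignore_demo"
git init --quiet
printf '%s\n' '# Ignore these folders' 'images/' '# Ignore these file types' '*.xlsx' > .gitignore

# Create one file that should be tracked and two that should be ignored
mkdir images
echo "plot" > images/fig1.png
echo "data" > survey.xlsx
echo "code" > model.R

# Ignored files do not appear here; only .gitignore and model.R are listed
git status --short

# check-ignore -v reports which rule (file, line number, pattern) matched each path
git check-ignore -v survey.xlsx images/fig1.png
```

If a file you expected to be ignored still shows up in `git status`, running `git check-ignore -v` on it is a quick way to see whether any pattern matches.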
## Final Thoughts
At this point you have initialized a folder, staged files, and committed them. Generally speaking, we will repeat all of these steps within *RStudio* in the next exercise. However, **Git** can be used locally, without a cloud-based repository, and is a great way to manage long-term projects. Even if you aren't using [**GitHub**](https://github.com/) yet, you should probably go ahead and [create an account](https://github.com/join). GitHub also provides a summary [cheatsheet](https://github.github.com/training-kit/downloads/github-git-cheat-sheet.pdf) for commonly used Git commands. This can be helpful if you forget some of the basic instructions for creating repositories, staging files, committing with a message, or just to [git by](https://youtu.be/UVtpXvzzXiA?t=22).
# Your Turn!
Using the skills you acquired above, create a git repository for **Exercise 2** titled *2-GLM*. In this repository <u>include</u> the *markdown document*, *html file*, and the **R** *project files*. Additionally, create a *.gitignore* file to <u>exclude</u> any *image files* you created, the *Excel dataset*, and any *ancillary folders*.
## Additional Help
After creating this exercise I stumbled onto this YouTube video, [Learn Git In 15 Minutes](https://www.youtube.com/watch?v=USjZcfj8yxE&t=602s), from [Colt Steele](https://www.youtube.com/channel/UCrqAGUPPMOdo0jfQ6grikZw/featured). This video provides a fantastic visual, step-by-step look at several **Git** commands. If you run into issues moving forward this video can likely answer your question, and remember, [Google](https://www.google.com) is your friend.