Merge pull request #3430 from programminghistorian/publish-gestion-manipulation-donnees-r

Publish /fr/lecons/gestion-manipulation-donnees-r
anisa-hawes authored Jan 8, 2025
2 parents 66044fa + aac1dbc commit 59374e9
Showing 28 changed files with 522 additions and 23 deletions.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion en/lessons/beginners-guide-to-twitter-data.md
@@ -130,7 +130,7 @@ At this point, your data has gone from the long list of single tweet IDs to a ro

Each tweet now has lots of useful metadata, including the time created, the included hashtags, number of retweets and favorites, and some geo info. One can imagine how this information can be used for a wide variety of explorations, including to map discourse around an issue on social media, explore the relationship between sentiment and virality, or even text analysis of language of the tweets.

All of these processes will probably include some light data work to format this dataset so that you can produce useful insights: [statistical analyses](/en/lessons/data-wrangling-and-management-in-R), [maps](/en/lessons/mapping-with-python-leaflet), [social network analyses](/en/lessons/exploring-and-analyzing-network-data-with-python), [discourse analyses](/en/lessons/corpus-analysis-with-antconc). But regardless of where you go from here, you have a pretty robust dataset that can be used for a variety of academic pursuits.
All of these processes will probably include some light data work to format this dataset so that you can produce useful insights: [statistical analyses](/en/lessons/data-wrangling-and-management-in-r), [maps](/en/lessons/mapping-with-python-leaflet), [social network analyses](/en/lessons/exploring-and-analyzing-network-data-with-python), [discourse analyses](/en/lessons/corpus-analysis-with-antconc). But regardless of where you go from here, you have a pretty robust dataset that can be used for a variety of academic pursuits.

You might have noticed we didn't get any latitude/longitude location information, but we did get a "place" column with less exact, textualized location information. Non-coordinate location data needs to be [geocoded](https://en.wikipedia.org/wiki/Geocode), which in this case means using a geocoder to [geoparse](https://en.wikipedia.org/wiki/Toponym_Resolution#Geoparsing) the reported locations and assign lat/long values to them. Different programs do this to greater or lesser success. [Tableau](https://www.tableau.com), for instance, has a hard time interpolating a set of locations if it's not at a consistent geographical level (city, state, etc.). For that reason, I generated latitude and longitude information with the Google geocoder following this *Programming Historian* [lesson](/en/lessons/mapping-with-python-leaflet), and then inputted that information into Tableau for mapping. There's plenty of good mapping [tools](https://digitalfellows.commons.gc.cuny.edu/2019/06/03/finding-the-right-tools-for-mapping/) out there that you can feel free to use: the key here is getting specific, accurate location information from the list of place names in the dataset.
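
As a rough illustration of that geocoding step, the sketch below uses the R `tidygeocoder` package rather than the Google geocoder mentioned above; the data frame `tweets` and its `place` column are assumptions about how the exported data is labelled.

```r
# A minimal sketch, assuming a data frame `tweets` with a textual `place` column.
# tidygeocoder is used here only for illustration; the lesson itself geocodes with
# the Google geocoder via the mapping-with-python-leaflet lesson.
library(tidygeocoder)

tweets_located <- geocode(tweets, address = place, method = "osm")
head(tweets_located[, c("place", "lat", "long")])
```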

@@ -1,6 +1,6 @@
---
title: Data Wrangling and Management in R
slug: data-wrangling-and-management-in-R
slug: data-wrangling-and-management-in-r
layout: lesson
collection: lessons
authors:
@@ -126,7 +126,7 @@ An Example of dplyr in Action
Let's go through an example to see how dplyr can aid us as historians by
inputting U.S. decennial census data from 1790 to 2010. Download the
data by [clicking
here](/assets/introductory_state_example.csv)
here](/assets/data-wrangling-and-management-in-r/introductory_state_example.csv)
and place it in the folder that you will use to work through the examples
in this tutorial.
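
A minimal sketch of loading that file once it sits in your working directory (the object name `us_state_populations_import` is an assumption about the name the rest of the lesson uses):

```r
# A sketch assuming the tidyverse is installed and the CSV is in the working directory.
library(tidyverse)

us_state_populations_import <- read_csv("introductory_state_example.csv")
glimpse(us_state_populations_import)
```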

@@ -164,7 +164,7 @@ time.
geom_line() +
geom_point()

{% include figure.html filename="en-or-data-wrangling-and-management-in-R-01.png" caption="Graph of California and New York population" %}
{% include figure.html filename="en-or-data-wrangling-and-management-in-r-01.png" caption="Graph of California and New York population" %}
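
For orientation, the full pipeline behind a plot like the one above probably resembles the sketch below; the filtering step and the column names (`state`, `year`, `population`) are assumptions, since only the tail of the code is visible in this hunk.

```r
# Illustrative sketch, not the lesson's verbatim code: keep two states,
# then plot their populations over time.
us_state_populations_import %>%
  filter(state %in% c("California", "New York")) %>%
  ggplot(aes(x = year, y = population, color = state)) +
  geom_line() +
  geom_point()
```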

As we can see, the population of California has grown considerably
compared to New York. While this particular example may seem obvious
@@ -182,7 +182,7 @@ with two different states such as Mississippi and Virginia.
geom_line() +
geom_point()

{% include figure.html filename="en-or-data-wrangling-and-management-in-R-02.png" caption="Graph of Mississippi and Virginia population" %}
{% include figure.html filename="en-or-data-wrangling-and-management-in-r-02.png" caption="Graph of Mississippi and Virginia population" %}

Quickly making changes to our code and reanalyzing our data is a
fundamental part of exploratory data analysis (EDA). Rather than trying
@@ -579,7 +579,7 @@ colleges founded before the U.S. War of 1812:
geom_bar(aes(x=is_secular, fill=is_secular))+
labs(x="Is the college secular?")

{% include figure.html filename="en-or-data-wrangling-and-management-in-R-03.png" caption="Number of secular and non-secular colleges before War of 1812" %}
{% include figure.html filename="en-or-data-wrangling-and-management-in-r-03.png" caption="Number of secular and non-secular colleges before War of 1812" %}
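
The chart above comes from a pipeline whose earlier steps are not visible in this hunk; a hedged reconstruction might look like the following, where the data frame `early_colleges` and the columns `denomination` and `established` are assumptions.

```r
# Illustrative sketch: derive an is_secular flag, keep colleges founded before
# 1812, and count them in a bar chart. All names are assumptions, not the lesson's code.
early_colleges %>%
  mutate(is_secular = ifelse(is.na(denomination), "secular", "not secular")) %>%
  filter(established < 1812) %>%
  ggplot() +
  geom_bar(aes(x = is_secular, fill = is_secular)) +
  labs(x = "Is the college secular?")
```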

Again, by making a quick change to our code, we can also look at the
number of secular versus non-secular colleges founded after the start of
@@ -593,7 +593,7 @@ the War of 1812:
geom_bar(aes(x=is_secular, fill=is_secular))+
labs(x="Is the college secular?")

{% include figure.html filename="en-or-data-wrangling-and-management-in-R-04.png" caption="Number of secular and non-secular colleges after War of 1812" %}
{% include figure.html filename="en-or-data-wrangling-and-management-in-r-04.png" caption="Number of secular and non-secular colleges after War of 1812" %}

Conclusion
==========
2 changes: 1 addition & 1 deletion en/lessons/geospatial-data-analysis.md
@@ -174,7 +174,7 @@ Now we have a large dataframe called `County_Aggregate_Data` which has our count
```r
religion <- read.csv("./data/Religion/Churches.csv", as.is=TRUE)
```
Depending on the state of the data you may need to do some data transformations in order to merge it back with the DataFrame. For complex transformations, see tutorials in R on working with data such as [Data Wrangling and Management in R tutorial](/en/lessons/data-wrangling-and-management-in-R) [data transforms](http://r4ds.had.co.nz/transform.html). In essence, you need to have a common field in both datasets to merge upon. Often this is a geographic id for the county and state represented by `GEOID`. It could also be the unique FIPS Code given by the US Census. Below I am using state and county `GEOID`. In this example, we are converting one data frame's common fields to numeric so that they match the variable type of the other dataframe:
Depending on the state of the data you may need to do some data transformations in order to merge it back with the DataFrame. For complex transformations, see tutorials in R on working with data such as [Data Wrangling and Management in R tutorial](/en/lessons/data-wrangling-and-management-in-r) [data transforms](http://r4ds.had.co.nz/transform.html). In essence, you need to have a common field in both datasets to merge upon. Often this is a geographic id for the county and state represented by `GEOID`. It could also be the unique FIPS Code given by the US Census. Below I am using state and county `GEOID`. In this example, we are converting one data frame's common fields to numeric so that they match the variable type of the other dataframe:

```r
religion$STATEFP <- religion$STATE
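# The lines below are an illustrative continuation, not the lesson's original code
# (the rest of this block is cut off in the diff): convert the shared keys to
# numeric and merge on them. Column names here are assumptions.
religion$STATEFP <- as.numeric(religion$STATEFP)
religion$COUNTYFP <- as.numeric(religion$COUNTY)
County_Aggregate_Data <- merge(County_Aggregate_Data, religion,
                               by = c("STATEFP", "COUNTYFP"))
```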
2 changes: 1 addition & 1 deletion en/lessons/sentiment-analysis-syuzhet.md
@@ -41,7 +41,7 @@ Although the lesson is not intended for advanced R users, it is expected that yo

* Taylor Arnold and Lauren Tilton, '[Basic Text Processing in R](/en/lessons/basic-text-processing-in-r)', *Programming Historian* 6 (2017), https://doi.org/10.46430/phen0061
* Taryn Dewar, '[R Basics with Tabular Data](/en/lessons/r-basics-with-tabular-data)', *Programming Historian* 5 (2016), https://doi.org/10.46430/phen0056
* Nabeel Siddiqui, '[Data Wrangling and Management in R](/en/lessons/data-wrangling-and-management-in-R)', *Programming Historian* 6 (2017), https://doi.org/10.46430/phen0063
* Nabeel Siddiqui, '[Data Wrangling and Management in R](/en/lessons/data-wrangling-and-management-in-r)', *Programming Historian* 6 (2017), https://doi.org/10.46430/phen0063

You may also be interested in other sentiment analysis lessons:

4 changes: 2 additions & 2 deletions en/lessons/shiny-leaflet-newspaper-map-tutorial.md
@@ -36,7 +36,7 @@ In this lesson, you will learn:
- The concept and practice of 'reactive programming', as implemented by Shiny applications. Specifically, you'll learn how you can use Shiny to 'listen' for certain inputs, and how they are connected to outputs displayed in your app.

<div class="alert alert-info">
Note that this lesson doesn't teach any coding in R, other than what's necessary to create the web application, nor does it cover publishing the finished application to the web. A basic knowledge of R, particularly using the <a href='/en/lessons/data-wrangling-and-management-in-R'>tidyverse</a>, would be very useful.
Note that this lesson doesn't teach any coding in R, other than what's necessary to create the web application, nor does it cover publishing the finished application to the web. A basic knowledge of R, particularly using the <a href='/en/lessons/data-wrangling-and-management-in-r'>tidyverse</a>, would be very useful.
</div>
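
To make the idea of 'listening' for inputs concrete, here is a minimal, self-contained reactive sketch; it is illustrative only and is not part of the newspaper-map application built in this lesson.

```r
# A minimal reactive sketch: the text output 'listens' to the slider input and
# re-renders whenever the slider value changes.
library(shiny)

ui <- fluidPage(
  sliderInput("year", "Choose a year:", min = 1800, max = 1900, value = 1850),
  textOutput("chosen_year")
)

server <- function(input, output) {
  output$chosen_year <- renderText(paste("You selected", input$year))
}

shinyApp(ui = ui, server = server)
```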

### Graphical User Interfaces and the Digital Humanities
@@ -108,7 +108,7 @@ First, however, you need to set up the correct programming environment and creat

To get started with this tutorial, you should install the latest versions of [R](https://cran.rstudio.com/) and [Rstudio](https://www.rstudio.com/products/rstudio/download/) on your local machine. The R programming language has a very popular IDE (Integrated Development Environment) called RStudio, which is often used alongside R, as it provides a large set of features to make coding in the language more convenient. We'll use RStudio throughout the lesson.

Previous *Programming Historian* lessons have covered [working with R](/en/lessons/r-basics-with-tabular-data) and [working with the tidyverse](/en/lessons/data-wrangling-and-management-in-R). It would be useful to go through these lessons beforehand, to learn the basics of installing R and using the tidyverse for data wrangling.
Previous *Programming Historian* lessons have covered [working with R](/en/lessons/r-basics-with-tabular-data) and [working with the tidyverse](/en/lessons/data-wrangling-and-management-in-r). It would be useful to go through these lessons beforehand, to learn the basics of installing R and using the tidyverse for data wrangling.
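
If you are starting from a fresh R installation, a minimal setup step might look like the sketch below; the exact package list is an assumption based on this lesson's title and prerequisites.

```r
# Illustrative setup: install the packages this lesson is likely to rely on.
install.packages(c("shiny", "leaflet", "tidyverse"))
```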

### Create a new RStudio Project

4 changes: 2 additions & 2 deletions es/lecciones/administracion-de-datos-en-r.md
@@ -18,7 +18,7 @@ translation-reviewer:
- Victor Gayol
review-ticket: https://github.com/programminghistorian/ph-submissions/issues/199
layout: lesson
original: data-wrangling-and-management-in-R
original: data-wrangling-and-management-in-r
difficulty: 2
activity: transforming
topics: [data-manipulation, data-management, distant-reading, r, data-visualization]
@@ -78,7 +78,7 @@ Copia el siguiente código en R Studio. Para ejecutarlo tienes que marcar las l
```

## Un ejemplo de dplyr en acción
Veamos un ejemplo de cómo dplyr nos puede ayudar a los historiadores. Vamos a cargar los datos del censo decenal de 1790 a 2010 de Estados Unidos. Descarga los datos haciendo [click aquí](/assets/ejemplo_introductorio_estados.csv)[^2] y ponlos en la carpeta que vas a utilizar para trabajar en los ejemplos de este tutorial.
Veamos un ejemplo de cómo dplyr nos puede ayudar a los historiadores. Vamos a cargar los datos del censo decenal de 1790 a 2010 de Estados Unidos. Descarga los datos haciendo [click aquí](/assets/data-wrangling-and-management-in-r/ejemplo_introductorio_estados.csv)[^2] y ponlos en la carpeta que vas a utilizar para trabajar en los ejemplos de este tutorial.

Como los datos están en un archivo CSV, vamos a usar el comando de lectura ```read_csv()``` en el paquete [readr](https://cran.r-project.org/web/packages/readr/vignettes/readr.html) de "tidyverse".

