Welcome to The Carpentries Etherpad!

This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.

Use of this service is restricted to members of The Carpentries community; this is not for general purpose use (for that, try etherpad.wikimedia.org).

Users are expected to follow our code of conduct: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html

All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/

Workshop Twitter hashtag: #LibCarpMoHub
Slack: https://missourihub.slack.com/

## Agenda

### Day 1

* Introductions
* Workshop Instructors and Helpers
* Missouri Hub
* Introduce yourself to two neighbors
* Logistics
* About Library Carpentry
* Introduction to Working with Data (Regular Expressions)
* The UNIX Shell

### Day 2

* Logistics reminder
* Introduction to Git
* OpenRefine

## Attending

Anna Oates/Federal Reserve Bank of St Louis/Anna.Oates@stls.frb.org/@annaoates
Heather Moulaison Sandy - University of Missouri iSchool - moulaisonhe@missouri.edu
Tori Lyons/Logan University/victoria.lyons@logan.edu
Brianna Chatmon/Stephens/University of Missouri/bctkd@mail.missouri.edu
Shannon Mawhiney/Missouri State University/smawhiney@missouristate.edu
Carol Clark/Saint Louis Art Museum/carol.clark@slam.org/@GLAMdatacarol
AJ Robinson/ Washington University/ robinson.a@wustl.edu
Jessica Kleekamp / Washington University in St. Louis / jkleekamp@wustl.edu
Marcy Vana / Washington University School of Medicine / vanam@wustl.edu
Jenny Bossaller/University of Missouri/bossallerj@missouri.edu
Drew Kupsky - Saint Louis University - drew.kupsky@slu.edu
Stephanie Chinn / Missouri University of Science & Technology / garvin@mst.edu
Jamillah Boyd University of Missouri St. Louis @jamillahboyd
Maze Ndukum / Washington University School of Medicine / ndukummaze@wustl.edu
Dylan Martin / Lincoln University of Missouri / martind2@lincolnu.edu
todd quinn / University of New Mexico
Levi Dolan / University of Missouri / ljd437@mail.missouri.edu
Anne Cox/State Historical Society of Missouri/coxan@shsmo.org/@woozymoose
Daron Dierkes / The Missouri Historical Society / ddierkes@mohistory.org
Chris Sorensen/Washington University School of Medicine/sorensenc@wustl.edu
Matt Butler / Missouri State Library / matthew.butler@sos.mo.gov
Evan Sprague /Washington University School of Medicine / e
Dorris Scott/ Washington University in St. Louis/d.scott@wustl.edu/@Dorris_Scott
Amanda Sprochi/MU/ sprochia@health.missouri.edu
Katherine Leonard / University of Missouri / knln9c@mail.missouri.edu
Felicity Dykas / University of Missouri - dykasf@missouri.edu
Dave LaCrone / Kansas City Public Library / davidlacrone@kclibrary.org
Emily Stenberg / Washington University in St. Louis / emily.stenberg@wustl.edu

## Introduction to Working with Data (Regular Expressions)

Used in programming to match patterns (and replace)
"Finding a needle in a haystack"
Often used with text/code
You can plug it in and use it in text/code editors, scripts, OpenRefine! (Another that is often cited, brackets.io)
Also known as RegEx or regex

We will start off using: https://regexr.com/ > Cheatsheet (let's first walk through this)

[A-Z] this is a range; it matches characters between and including A and Z, capital letters only
[a-z] this range matches characters between and including a and z, lower case only
\w matches any word character (alphanumeric & underscore)
+ matches one or more of the preceding token (\w in this case)
\d matches a digit
\s matches white space
\s{2,} matches two or more whitespace characters
\b\w{5}\b matches words with 5 characters
.* matches any character 0 or more times (greedy: as many as possible)
.*? matches any character 0 or more times (lazy: as few as possible)

So, what is ^[Oo]rgani.e\b going to match?
At the start of a line Organize or organize or Organise or organise
Organize, organize, organise, Organise
Organize, Organize organice
Organize with lower or upper case beginning in either US English or otherwise, or any other character to make a nonsense word. Must begin with all characters in the RegEx and end with the boundary
Organize, organize, Organipe, organibe... (the ^ means the line must start with O or o)
Lines that start with organize spelled with any character where the z is (upper or lowercase o)
Organize upper or lower case at beginning of line spelled in British or American English
Organize, organize, Organise, organise

What will the regular expression ^[Oo]rgani.e\w* match?

Organize organizen organizea
Organized, organisers
At the start of a line Organizes, Organized, organized or organises, organiked, organike
organibe234234jweljaltjq234oj2oqfjoaw4vnq3o
Start of the line has O or o followed by rgani, any character, anything after e
Organizedly
Starting with O or o, matching rgani, any character, matching e, with any zero or more characters after
organise343443435353353

What will the regular expression \b[Oo]rgani.e\b|\b[Oo]rgani.e\w{1}\b match?

Start of a line Organi/organi, seventh character anything, followed by an e, OR
Orangile
organize or organizer, but not organizers
Organize upper or lower case British or American spelling or organiz/ser/d/ something
Starts with entire expression, leading with O or o, matching rgani, any character, ending with e OR Starts with entire expression, leading with O or o, matching rgani, has e, ending on boundary matching one of any character

Find all of the words starting with Comm or comm that are plural.
[Cc]omm\w*s\b
\b[Cc]omm\w*s\b
\b[Cc]omm\w*s\b
\b[Cc]omm\w+s\b
\b[Cc]omm.*s\b (this includes spaces) = 23 results

Isolating email addresses from the Software Carpentry Code of Conduct

\w*@\w*\.[a-z]*
\b\w+@\w+\.\w+\b
\b\w*@\w*\.\w*\b
\b\w*@\w*\.\w{3}\b
\b\w+@ (missed part of one)
\b\w+@\w+\.\w+\b
\b\w+@\w+\.[A-Za-z]+
\w+\@\w+\.\w+
@\w*\.
[\w]+@
\w+@\w+\....
\w+@\w*\.
\w+@\w+\.\w+\b
\w|
\b\w*@\w*\.\w*\b
[\w]@

NOTE ON THE CHEAT SHEETS
Using parentheses around an expression turns it into a group. When using regex to replace strings, you can group the strings you are matching, and then reference the groups with a dollar sign and the group number ($1, $2, ...). You'll see an example of this in the markdown-to-html example.

Examples of ways used:
- In OpenRefine to change formatting in a column
- On the shell, looping to change file names
- MarcEdit to fix/standardize wonky date formats
- To find and replace very specific strings

Another helpful regular expressions cheatsheet: http://www.cbs.dtu.dk/courses/27610/regular-expressions-cheat-sheet-v2.pdf

For our exercises, we will use The Carpentries Code of Conduct: https://raw.githubusercontent.com/libcce/lc-lesson-materials/master/code-of-conduct.md (copy to regexr)
Ctrl-A to select all, Ctrl-C to copy

Regular expression challenges using The Carpentries Code of Conduct:
1. Find different spellings of organize
2. Change the Markdown links to HTML links/tags
3. Change the bold Markdown headings (with HTML break tags) to HTML heading tags
4. Change the Markdown headings to HTML headings

Start here as group:
- Switch the ISO dates to USA format (mm-dd-yyyy)
Expression: (\d{4})-(\d{2})-(\d{2})
Replace with: $2-$3-$1

RegEx Exercise - Possible Answers

- Find different spellings of organize
organi[zs]\w+
\organi.e
(organi).(e) and then to replace the s or z <<$1z$2>>, this way leaves the endings (-er) untouched

- Change the Markdown links to HTML links/tags?
[link text](https://www.google.com)
(\[)(.*?)(\]\s*\()(.*?)(\))
$2
OR
(\[.*\])(\(.*\))
$1

- Change the bold Markdown headings (with HTML break tags) to HTML heading tags
(\*\*)(.*?)(\*\*)
$2

- Change the Markdown headings to HTML headings
(\#{2,4}\s*)(.*?)($)
$2

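The grouped replacements in these answers can also be tried outside regexr. The block below is a minimal sketch with `sed -E`, not the workshop's own solution: the sample strings are made up, the `<a>` and `<h2>` tags are assumptions (the replacement strings recorded in these notes show only the group references), and because `sed` has no lazy `.*?`, negated character classes are swapped in. `sed` also writes group references as `\1`, `\2` rather than `$1`, `$2`.

```shell
# Made-up sample lines, for illustration only.
link='See the [Code of Conduct](https://example.org/coc) for details.'
heading='## Reporting Guidelines'
iso_date='2019-11-21'

# Markdown link -> HTML anchor (assumed tag). Group 1 = link text, group 2 = URL.
# [^]]* and [^)]* replace the lazy (.*?) used in regexr.
html_link=$(printf '%s\n' "$link" |
  sed -E 's/\[([^]]*)\]\(([^)]*)\)/<a href="\2">\1<\/a>/g')

# Markdown heading -> HTML heading; <h2> is an assumed target level.
html_heading=$(printf '%s\n' "$heading" |
  sed -E 's/^#{2,4}[[:space:]]*(.*)$/<h2>\1<\/h2>/')

# ISO date (yyyy-mm-dd) -> mm-dd-yyyy by reordering the three groups.
us_date=$(printf '%s\n' "$iso_date" |
  sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\2-\3-\1/g')

printf '%s\n' "$html_link" "$html_heading" "$us_date"
```

The same group-and-reorder idea carries over to any tool that supports capture groups; only the reference syntax changes.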
- Switch the ISO dates to USA format
(\d{4})(\-)(\d{2})(\-)(\d{2})
$3/$5/$1
OR
Expression: (\d{4})-(\d{2})-(\d{2})
Replace with: $2-$3-$1

\* matches the asterisk (*); you need to escape it with a backslash "\"

Exercise 2
Go here https://www.thetranscriptionpeople.com.au/2015/04/14/a-humorous-look-at-how-punctuation-can-change-meaning/ and copy the main block of text. Practice making all of the sentences either with the Oxford comma, or all of them without.

More info/exercises: https://librarycarpentry.github.io/lc-data-intro/04-regular-expressions/index.html

You might have found this helpful if I had thought to paste this in beforehand: Regular Expressions Cheat Sheet https://docs.google.com/document/d/1P-LAtXb5S8F_tIE9TQOT2qP5ItEwMnR81AxK519T5Ig/edit

## The UNIX Shell

### Data Files

You need to download some files to follow this lesson: https://raw.githubusercontent.com/librarycarpentry/lc-shell/gh-pages/data/shell-lesson.zip

1. Download shell-lesson.zip and move the file to your Desktop.
2. Unzip/extract the file (ask your instructor if you need help with this step). You should end up with a new folder called shell-lesson on your Desktop.

Note: Something to look out for: sometimes there are issues in getting to the shell-lesson folder via Windows Git Bash. Go to the directory, right click, and open/select Git Bash, or cd /c/...
First, if you were unable to install Git Bash then try this: https://console.cloud.google.com/cloudshell/editor?shellonly=true&pli=1
wget github.com/LibraryCarpentry/lc-shell/raw/gh-pages/data/shell-lesson.zip
unzip shell-lesson.zip

### About the Shell

Before we had graphical interfaces, we had the command line interface
The UNIX shell dates back to the 1970s
Historical flowchart: https://en.wikipedia.org/wiki/File:Unix_history-simple.svg

Further reading
The unix shell: https://en.wikipedia.org/wiki/Unix_shell
Unix-like: https://en.wikipedia.org/wiki/Unix-like
What is unix?: A brief introduction to unix: https://www.softwaretestinghelp.com/unix-introduction/
A beginner's guide to the unix command line: https://www.osc.edu/supercomputing/unix-cmds
Command line crash course: https://learnpythonthehardway.org/book/appendixa.html

Use cases
- Programming, data science work, research computing
- Wrangling with and cleaning lots of data/files
- Example: ORCID data dump via Figshare https://orcid.figshare.com/articles/ORCID_Public_Data_File_2018/7234028
- Example: Mining journal article PDFs at the European Southern Observatory https://www.eso.org/sci/libraries/telbib_methodology.html

HELP
- `man ls`
- `help COMMAND` (for shell built-ins such as `cd`)
- Google it
- Explain Shell: https://explainshell.com/explain?cmd=ls+-lh
- TL;DR: https://tldr.sh/
- Basic UNIX Commands: https://www.tjhsst.edu/~dhyatt/superap/unixcmd.html
- For Windows: http://man.he.net/
- `COMMAND --help`
- Example: `ls --help`

TEXT EDITORS
- nano - default
- Notepad++ (Windows): https://notepad-plus-plus.org/
- If using Notepad++, here is a cheatsheet for keyboard shortcuts: http://www.cheat-sheets.org/saved-copy/Notepad++_Cheat_Sheet.pdf
- Sublime (macOS, Linux): https://www.sublimetext.com/
- Atom (macOS): https://atom.io/
- Editpad Pro (Windows): https://www.editpadpro.com/download.html
- Visual Studio Code (Windows, macOS, Linux): https://code.visualstudio.com/
- IntelliJ (Windows, macOS, Linux): https://www.jetbrains.com/idea/

### Working with files and directories

https://drive.google.com/file/d/12N579z4FgK9yM3m4j0XBkQQxKdcJk62s/view?usp=sharing
https://raw.githubusercontent.com/librarycarpentry/lc-shell/gh-pages/data/shell-lesson.zip

KEY COMMANDS

`pwd` - print working directory
`cd` - change directory
`ls` - list directory contents
- helpful flags/options: `ls -l` and `ls -lh`

Columns in `ls -l` output: permissions / number of links / user / group / size / date / folder or filename
`d` - stands for directory / chmod 777 (file) / `@` marks extended attributes (macOS) / ls -a (see hidden files)

QUIZ
* What flags do you use to list contents of a directory in long listing format and sort by modification date, newest first?
* And how can you order by file size?
* How can you see hidden files?

ANSWERS
`ls -lt` (order by mod date)
`ls -lS` (order by file size)
`ls -a` (do not ignore entries starting with .)

`pwd`
`mkdir firstdirectory`
`cd firstdirectory`
`cd ..`
`ls -lh`

`cat` - concatenate files and print on the standard output (in other words open and print a file to screen)
type `cat 82` and press [TAB] to autocomplete the file name
`cat 829-0.txt`

QUIZ
* What is the title of 829-0?

cp gulliver.txt gulliver-backup.txt
cp gulliver.txt gulliver_backup.txt

ANSWER
GULLIVER’S TRAVELS

`head` - output the first part of files (first 10 lines)
`head 829-0.txt`
`tail` - output the last part of files (last 10 lines)
`tail 829-0.txt`

QUIZ
TASK: Create a for loop that prints the name, first line, last line of each text (.txt) file in the current directory.

for file in *.txt; do; echo "$file"; head -n *.txt; tail -n 1 *.txt;done (syntax error: no ";" after do; head -n needs a number, and "$file" rather than *.txt)
for filename in *.txt; do echo $filename; head -n 1 $filename; tail -n 1 $filename; done
for file in *.txt; do echo "$file"; head -n 1 $file; tail -n 1 $file; done
for file in *.txt do ; echo "$file" ; head -n 1 $file ; tail -n 1 $file ; done (syntax error: the ";" belongs before do, not after it)
for file in *.txt; do echo "$file"; head -n 1 "$file"; tail -n 1 "$file"; done

* How can you return the first 20 lines of 829-0.txt?
* How can you return the last 30 lines of 829-0.txt?
ANSWER
`head -n20 829-0.txt`
`tail -n30 829-0.txt`

Example: Sometimes files are too big to open, and head and tail can be a lightweight way to peek inside or to get header information in an automated way.

`less` - allows you to scroll/page through file
`less 829-0.txt`
Navigating output: `spacebar` to page, up and down arrows, `q` to quit

`mv` - move (rename) files
`mv 829-0.txt gulliver.txt`

QUIZ
What is the title of 33504-0.txt and can you rename it to its title.txt?

ANSWER
Opticks and `mv 33504-0.txt opticks.txt`

mv renames a file; cp copies a file and places the copy under a new file name wherever you want it to go

QUIZ
Can you create backup files of the two titles above in a "backup" folder naming the files by adding "_backup.txt"?

ANSWER
mkdir backup
cp gulliver.txt backup/gulliver_backup.txt
cp opticks.txt backup/opticks_backup.txt

Wildcards
What does * do?

QUIZ
How can we use this wildcard to match + list all the .txt files?

ANSWER
ls *.txt

How can you see the history of your commands?
- You can use the up and down arrow keys
- You can use history
history
!number re-runs the command with that history number (!number:p prints it without running it)
You can also redirect output of your history to a text file
history > history.txt

For a taste of Shell programming, let's create a variable which holds a value:
NAME=Groot
And let's print out to the command line:
echo "I am $NAME"

Create a number of files quickly using touch
touch a.txt b.txt c.txt d.txt

Now for the scripting!
We will create a Bash script on the command line
For our script we are going to loop through all the text files
And we are going to print the file name
Then we are going to finish

$ for filename in *.txt
> do
>     echo $filename
> done

Before our exercise, how can we create and edit a file on the command line? Let's use the command line tool "nano." You can use an alternative text editor, such as Notepad++, Sublime, Atom, etc.
nano myfile.txt
Editor screen appears
Write "This is my file!"
Ctrl + X to close
"y" (or "yes") to confirm that you want to save
Enter to accept the file name and exit

EXERCISE
- Create a file called myscript.sh
- Add a similar for loop to the myscript file
- In this loop you will print the file name to screen
- And you will print the first and last 5 lines of the file to screen
- Then you will end the loop

Note: Add...
#!/bin/bash
# My first script
... to the top of the file

ANSWER: myscript.sh
#!/bin/bash
# My first script
for filename in *.txt
do
    echo "$filename"
    head -n 5 "$filename"
    tail -n 5 "$filename"
done

Run the script: ./myscript.sh OR bash myscript.sh
To make the file executable (needed for ./myscript.sh), run `chmod +x FILENAME`
ABOUT #! https://www.in-ulm.de/~mascheck/various/shebang/
TIP: Ctrl + C to quit when in infinite loop

Navigate to shell-lesson dir
`ls -lh`
`wc` - word count
`wc *.tsv` (see lines, words, and bytes)

QUIZ
What options are available to you in wc?

`wc -l *.tsv > lengths.txt`
`cat lengths.txt`

Piping
Two or more commands connect together via a "|"
Order is -> | -> | ...
`wc -l *.tsv | sort`
`wc -l *.tsv | sort -r`

-------
EXERCISES

We have our wc -l *.tsv | sort -n | head -n 1 pipeline. What would happen if you piped this into cat?
wc -l *.tsv | sort -n | head -n 1 | cat
5375 2014-02-02_JA-britain.tsv
wc -l *.tsv | sort -n | head -n 1

-w == words - can be used to get the word count (e.g., wc -w FILENAME)
Know the 10 files that contain the most words.
wc -w * | sort -n | head -n 10 (this lists the 10 smallest counts)
wc -w *.tsv | sort -nr | head -n 11 (11 because the first line is the total)
wc -w * | sort -n | tail -n 11
wc -w *.tsv | sort -nr | head -n 11
wc -w *.tsv | sort -n | tail -n 11
wc -w *.tsv | sort -n | tail -n 11

-c - grep flag for counting the number of matching lines
-o - flag to limit return to the exact string match, not the complete line which contains a match

1. Search for all case sensitive instances of a word of your choice in all derived .tsv files. Print your results to the shell.
grep -wc speculation *.tsv
grep -w Egypt *.tsv
grep -w Dakota *.tsv
grep -w history *.tsv
grep -w Foo *.tsv
grep -w Washington *.tsv
grep -w social *.tsv
grep -w Africa *.tsv
grep -wc battle *.tsv
grep -w India *.tsv
grep -wc history 2014-01-31_JA-africa.tsv
grep -w war *.tsv
grep -w colonial *.tsv
grep -w virgina *.tsv

2. Move up to the shell-lesson directory. Search for all case sensitive instances of a word of your choice in the "America" and "Africa" .tsv files in the shell-lesson directory. Print your results to the shell.

grep -w colonial *a.tsv
grep -w Dakota 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w history 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w speculation 2014-01-31_JA-america.tsv 2014-01-31_JA-africa.tsv
grep -w social 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w Washington 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w war 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w water 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -w slavery *a.tsv

3. Count all case sensitive instances of a word of your choice in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.

grep -c bear 2014-01-31_JA-a*.tsv
grep -wc colonial *a.tsv
grep -cw Dakota 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wc Washington 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -c speculation 2014-01-31_JA-america.tsv 2014-01-31_JA-africa.tsv
grep -c bear 2014-01-31_JA-a*.tsv
grep -wc social 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wc history 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -ci Connecticut *a.tsv
grep -c war 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wc Bird 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wc fruit 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv

4. Count all case insensitive instances of that word in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to the shell.

grep -cwi Dakota 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -cwi speculation 2014-01-31_JA-america.tsv 2014-01-31_JA-africa.tsv
grep -wci colonial *a.tsv
grep -wci social 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wci history 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wci washington 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -wci Bird 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -cwi war 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv
grep -cwi Nevada 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv

5. Search for all case insensitive instances of that word in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to a file results/[wordOfChoice].tsv.

grep -wi Dakota 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/Dakota.tsv
grep -wi colonial *a.tsv > results/colonial.tsv
grep -i social 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/social.tsv
grep -iw speculation 2014-01-31_JA-america.tsv 2014-01-31_JA-africa.tsv > results/speculation.tsv
grep -i history *a.tsv > results/history.tsv
grep -wi Dakota 2014-01-31_JA-africa.tsv *a.tsv > Dakota-i.tsv
grep -wi history 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/history.tsv
grep -wi washington 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/Washington.tsv
grep -wi Bird 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > ./results/Bird.txt
grep -wi war 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/war.tsv
grep -wi Nevada 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv

6. Search for all case insensitive instances of that whole word in the ‘America’ and ‘Africa’ .tsv files in this directory. Print your results to a file results/[wordOfChoice]-i.tsv.
grep -wi Dakota 2014-01-31_JA-africa.tsv *a.tsv > Dakota-i.tsv
grep -wi history 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/history.tsv
grep -wi \bwashington\b 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/Washington.tsv (unquoted, the shell strips the backslashes; -w already restricts matches to whole words)
grep -i war 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv > results/war-i.tsv
grep -iw war *a.tsv > results/war-i.tsv
grep -wi social *a.tsv > results/social-i.tsv
grep -wi Nevada 2014-01-31_JA-africa.tsv 2014-01-31_JA-america.tsv >results/Nevada.tsv

7. Use regular expressions to find all ISSN numbers (four digits followed by hyphen followed by four digits) in 2014-01_JA.tsv and print the results to a file results/issns.tsv. Note that you might have to use the -E flag (or -P with some versions of grep, e.g. with Git Bash on Windows).

grep -E '\d{4}-\d{4}' 2014-01_JA.tsv > ./results/issns.tsv (-E enables the {4} repetition; \d is a Perl-style shorthand that not every grep accepts with -E, in which case use -P or write [0-9])
grep -P '\d{4}-\d{4}' 2014-01_JA.tsv > results/issns.tsv
grep -P '\d{4}-\d{4}' 2014-01_JA.tsv > results/issns.tsv (-P recognizes the pattern of digits xxxx-xxxx)
grep -P '\d{4}-\d{4}' 2014-01_JA.tsv > results/issns.tsv
grep -E '\d{4}-\d{4}' 2014-01_JA.tsv > results/issns.tsv
grep -E '\d{4}-\d{4}' 2014-01_JA.tsv > results/issns.tsv
grep -P '\d{4}\-\d{4}' 2014-01_JA.tsv > results/issns.tsv

QUIZ
How would you get the file with the lowest number of lines? And how can you save that to a txt file?

ANSWER
wc -l *.tsv | sort -n | head -n 1 > topsort.txt

TIP: If you wanted to append a date stamp to the end of the file you just created: date >> topsort.txt

`grep` - print lines matching a pattern
grep is probably one of the most useful command line tools for searching for matches within files/directories
`grep 1999 *.tsv`
By the way, how could we have redirected the output to a txt file?
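One way to answer the redirect question, together with a portable form of the ISSN search from exercise 7, is sketched below. It is only an illustration: `sample.tsv` and its three rows are made up rather than taken from the lesson data, and `[0-9]` is used in place of `\d` so the pattern does not depend on which grep is installed.

```shell
# Work in a throwaway directory so nothing in shell-lesson is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir results

# Hypothetical sample data: tab-separated title, ISSN, year.
printf 'Sample Journal A\t1234-5678\t1999\n'  > sample.tsv
printf 'Sample Journal B\t8765-4321\t1999\n' >> sample.tsv
printf 'Sample Journal C\tn/a\t2001\n'       >> sample.tsv

# ">" sends the matching lines to a file instead of the screen.
grep 1999 sample.tsv > results/1999.txt

# -E enables the {4} repetition; -o prints only the matched text,
# one match per line, rather than the whole line.
grep -E -o '[0-9]{4}-[0-9]{4}' sample.tsv > results/issns.tsv

# Count the matches (tr strips the padding some wc versions add).
issn_count=$(wc -l < results/issns.tsv | tr -d ' ')
echo "$issn_count"
```

Swapping `>` for `>>` would append to an existing results file instead of overwriting it.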
`grep -c 1999 *.tsv` (lists number of matching lines per file)
`grep -c revolution *.tsv` (case sensitive search)
`grep -ci revolution *.tsv` (case insensitive search)
## Try other keywords here like America or German

Here is an example that I used at the European Southern Observatory to find instrument names in context:
`grep -C 2 'HARPS' *` (get two lines of context around each match)

QUIZ
How would you search for the China journal in the same files?

ANSWER

`grep -iwE 'fr[ae]nc[eh]' *.tsv` (flags: i case insensitive, w whole word, E extended regular expression)
## Try to find variations of organize

How can we tell if the number of matches has changed with our regex?
`grep -o 'needle' haystack | wc -l`

EXERCISE
How would we find ISSNs in 2014-01_JA.tsv using grep, regex, and redirect output to an issns.tsv file? We can walk through this together and write it on the board...

ANSWER

EXERCISE
Combine what you learned of the for loop with using grep to find the word counts of names in gulliver.txt...

ANSWER

Quick demo of sed which allows you to replace words in file:
less diary.html
Look inside and replace foo with bar (some word)
sed -i '' 's/Daddy/Mommy/g' diary.html (macOS/BSD form; with GNU sed on Linux or Git Bash, drop the empty '' after -i)

## Introduction to Git

Why would you want to learn this? Examples:

Democratic databases: science on GitHub. Scientists are turning to a software–development site to share data and code.
https://www.nature.com/news/democratic-databases-science-on-github-1.20719

Making Code Citeable
https://guides.github.com/activities/citable-code/

Journal of Open Source Software
https://joss.theoj.org/

Our path to better science in less time using open data science tools
https://www.nature.com/articles/s41559-017-0160

FAIR Data Action Plan (Issues for Community Feedback)
https://github.com/FAIR-Data-EG/Action-Plan/issues

Code4Lib Community Statement in Support of Chris Bourg
https://github.com/code4lib/c4l18-keynote-statement

Conference Websites
https://libcce.github.io/TriangleJupyter/

Library Carpentry Lessons
https://github.com/LibraryCarpentry

Carpentries Workshop & Lesson Templates
https://github.com/carpentries/workshop-template
https://carpentries.github.io/lesson-example/setup.html

### Go to GitHub

https://github.com/
Can search all public github repositories.

### Sign up (create an account)

Once you've done this, copy and paste the link to your account here:
<< For example, mine is https://github.com/annaoates - Anna Oates>>

https://github.com/ddierkes - Daron Dierkes, Missouri Historical Society
https://github.com/lacrone - Dave LaCrone
https://github.com/emrsster - Emily Stenberg, WashU
https://github.com/drewkupsky - Drew Kupsky, Saint Louis University
https://github.com/brichatmon/KC - Brianna Chatmon, Stephens College, SISLT Student, University of Missouri
https://github.com/leonardstl - Katherine Leonard, University of Missouri
https://github.com/firbolg - Levi Dolan
https://github.com/genreina - Shannon Mawhiney, MSU
https://github.com/annecox - Anne Cox, SHSMO
https://github.com/dykasf/ - Felicity Dykas, U of Missouri
https://github.com/carolaclark - Carol Clark, Saint Louis Art Museum
https://github.com/butlermt - Matt Butler, Missouri State Library
https://github.com/asprochi - Amanda Sprochi, MU
https://github.com/bossjen - Jenny Bossaller, University of MO
https://github.com/hlmoulaison - Heather Moulaison Sandy
https://github.com/dmart423 - Dylan Martin, Lincoln University of Missouri
https://github.com/jkleekamp - Jessica Kleekamp, Washington University in St. Louis
https://github.com/stephchinn - Stephanie Chinn, Missouri University of Science & Technology
https://github.com/cjsorensen15 - Chris Sorensen, Washington University School of Medicine
https://github.com/deniceadkins - Denice Adkins, School of Information Science & Learning Technologies, University of Missouri
https://github.com/vanamarc - Marcy Vana, Washington University School of Medicine
https://github.com/momiji15 - Dorris Scott, Washington University in St. Louis
https://github.com/mightylibrarian - Todd Quinn, University of New Mexico
https://github.com/EvanSprague - Evan Sprague, Washington University School of Medicine
https://github.com/robinsonaj - AJ Robinson, Washington University
https://github.com/ndukumm - Maze Ndukum, Washington University School of Medicine
https://github.com/toribethlyons - Tori Lyons, Logan University
https://github.com/jamillahboyd - Jamillah Boyd, University of Missouri St. Louis

Readme file is the first thing one sees when they find your repository. Readme files should give enough information about your repository contents.
#### Connect to local

`git config --global user.name "Your Name"`
`git config --global user.email "your@email"`
`git config --global core.editor "nano -w"`
https://help.github.com/en/github/getting-started-with-github/set-up-git

### Using Git

`mkdir REPONAME`
`cd REPONAME`
`git init`
`git status`
`git add FILENAME`
`git commit -m 'ADDMESSAGE'`
`git remote add origin GITHUBURL` (GITHUBURL is the repository's https or ssh address)
`git remote -v`
`git push -u origin master`
`git diff`
`git log`
`git push`
`git pull`

git commands cheatsheet: https://education.github.com/git-cheat-sheet-education.pdf

### Post-it exercise (forking, branching, merging w/ cats)

https://guides.github.com/introduction/flow/

forking - creating a copy in your github account
branching - users who forked a repo make changes in their account on this new branch
pull request - ask to add your changes to the original repository
merging - move the changes from the branches to the original repository

#### GitHub Pages

`git checkout -b gh-pages`
`git push --set-upstream origin gh-pages`
`--set-upstream` is the long form of `-u`, as in `git push -u origin master`
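The local part of the workflow listed under "Using Git", up to creating the `gh-pages` branch, can be sketched in a throwaway repository as below. This is an illustration under a few assumptions: the README content is a placeholder, the identity is set for this repository only rather than with `--global`, and the `git remote` and `git push` steps are left out because they need a real GitHub repository.

```shell
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Identity for this repository only (the notes above use --global).
git config user.name "Your Name"
git config user.email "your@email"

# Create a file, check the state, stage it, and commit it.
echo "# REPONAME" > README.md
git status --short          # README.md is listed as untracked (??)
git add README.md
git commit -q -m "Add README"
git log --oneline           # one commit so far

# Create and switch to a gh-pages branch (note the hyphen in the name).
git checkout -q -b gh-pages
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "$current_branch"
```

From here, `git push --set-upstream origin gh-pages` would publish the branch once a remote named `origin` has been added.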
Navigate to REPONAME Settings
Go to settings to turn on GitHub Pages
Lots of Jekyll themes to choose from

Can use the HackMD tool to explore: https://hackmd.io/
Share here: https://hackmd.io/s/rJmqomaTS
Can edit in HackMD and cut/paste into github md files.
Markdown cheat sheet: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

- Fork a repository from someone in the room (colleague github repos to choose from are above). Click the "Fork" button at the top to put a copy of the repo in your github
- Create a Pull Request from your fork (the copy in your account) to suggest a text/format change that can be merged by the owner

## GitHub Pages Resources

You can add a theme to spruce up your GitHub Page!
Jekyll Themes: https://jekyllthemes.io/
Hugo Themes: https://themes.gohugo.io/

## GitHub Troubleshooting

https://github.com/momiji15/gittingit/blob/master/howto.md

## OpenRefine

Open Refine Cheat Sheet: https://docs.google.com/document/d/1RJVPyAChehfeVEd2DoltL6mm-N7NppHhUkha1Et2_gs/edit
File you need to download: https://github.com/LibraryCarpentry/lc-open-refine/raw/gh-pages/data/doaj-article-sample.csv (In Safari, right click and select download linked file; in Chrome and Firefox, right click and select save link as)
Remember that all of our instructions are available here: https://librarycarpentry.org/lc-open-refine/

## Checklist

* Logistics - restrooms, water fountain, emergency exits, emergency contact
* Review Code of Conduct with learners (https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html) We are all learners!
* Can you see the text (make it bigger?), can you hear us (speak louder?), can you slow down?, any other access questions?
* Schedule: Intro to Data, Shell (day 1), Git, OpenRefine (day 2) https://libcce.github.io/2019-01-07-MTSU/ (coffee breaks at 10:30 and 2:30, lunch at noon)
* Remind learners to use sticky notes to give feedback
* Get feedback at lunch and end of each day using sticky notes
* Collect attendee names
* Lessons are online at https://librarycarpentry.org/lessons/