- Benson | he/him
- Ann James | she/her | George Washington University in Washington DC
- Ernest Alema-Mensah, Morehouse School of Medicine
- Lyrric Jackson | Spelman College, she/her/hers
- Chuang Peng, /he/his/him, Morehouse College
- Michael Dillon, Morehouse College
- Lynne Patten, Clark Atlanta University
- Jessica Afangnibo, she/her, Morehouse School of Medicine
- Zakiya Barnes / she/ her/ Clark Atlanta University
- Yvonne Phillips / she/her/ Morehouse College
- Kenya Jones /she/her/Clark Atlanta University
- Britt Clark /she/her Atlanta University
- Renee/Clark Atlanta University
- Robin/she/her/Clark Atlanta University
- Youseung Kim/He/his/him/ Clark Atlanta University
- Issifu Harruna /Clark Atlanta University
- Mareena Pitts/Morehouse School of Medicine
- Shyheim Williams / he/him/his / Clark Atlanta University
- Oyebade Oyerinde/Clark Atlanta University
- Lakeshia Legette Jones (she/her) / Clark Atlanta University
- Ali Aljoda - Spelman College
- Tory /he/his/him/Clark Atlanta University
- Lanisha| She/her| Morehouse School of Medicine
- Tameyah Mathis-Perry she/her/hers Morehouse School of Medicine
- And take a moment to complete the pre-workshop survey if you've not yet had a chance to do so online at https://carpentries.typeform.com/to/wi32rS?slug=2023-06-20-aucenter-online
- Introduce yourselves and list terms that you find interesting or would like to learn more about:
- Ann's Breakout room:
- Machine Learning
- checking out, checking in, fetch
- python--the process of coding
- Jargon Busting Exercise:
- This exercise is an opportunity to gain a firmer grasp on the concepts around data, code or software development in libraries.
-
- Pair with a neighbor and decide who will take notes (or depending on the amount of time available for the exercise, skip to forming groups of four to six).
- Talk for three minutes (your instructor will be timing you!) on any terms, phrases, or ideas around code or software development in libraries that you’ve come across and perhaps feel you should know better.
- Next, get into groups of four to six.
- Make a list of all the problematic terms, phrases, and ideas each pair came up with. Retain duplicates.
- Identify common words as a starting point - spend 10 minutes working together to try to explain what the terms, phrases, or ideas on your list mean. Note: use both each other and the internet as a resource.
- Identify the terms your groups were able to explain as well as those you are still struggling with.
- Each group then reports back on one issue resolved by their group and one issue not resolved by their group.
- The instructor will collate these on a whiteboard and facilitate a discussion about what we will cover today and where you can go for help on those things we won’t cover. Any jargon or terms that will not be covered specifically are good notes.
- Group 1--Ann
- Git terminology such as checkout cherry picking and branching
- Working in a Unix environment
- Group 2
- Chuang Peng, how to connect to library data
- basic unix commands for navigating the environment
- git branching
- Group 3 - Benson
- AI
- Syntax
- Drug discovery
- kernel
- Unix Shell
- ssh
- Group 4- Lyrric
- Branching (git) and versions (problematic to manage). How to commit to code. Perhaps, talking about pros and cons of txt, json, csv, xls files as data repository.
- Cleaning data to form structure queries
- Knowledge about Unix Shell and SQL server. Git (I know Github, but don't know "Git").
- The Unix Shell
- Summary and Setup Instructions for Quick Reference: https://librarycarpentry.org/lc-shell/
- # NOTES, Command history
- $ pwd # present working directory
- $ ls
- $ ls -l # use a flag to get additional information
- $ ls -lh # get file sizes in better readable units
- $ ls-lh # the space is important
- $ ls -lh, # adding extra characters can give errors
- # ctrl+ and ctrl- to change font size
- $ cd Desktop # change directory to Desktop
- $ pwd # verify have changed directory
- $ cd "things to learn" # informed folder does not exist
- $ pwd # confirm location
- $ ls # list files and directories
- $ history # see previous commands
- # Getting help
- $ man ls # works on Mac and linux not GitBash
- $ ls --help # works on Windows GitBash
- $ ls -d # try one of the flags listed
- # https://stackoverflow.com/questions/14352290/listing-only-directories-using-ls-in-bash
- $ ls -d */ # list folders
- $ clear # remove output, clean screen
- $ history # previous commands typed in, even those that did not execute properly
- $ cd Desktop/shell-lesson # go into data directory that was downloaded
- # available at https://librarycarpentry.org/lc-shell/data/shell-lesson.zip
- $ mkdir firstdir # make a directory called firstdir
- $ cd firstdir # go into directory
- $ pwd # get path
- $ cd .. # move up a level
- $ pwd # get path
- $ cd .. # move up another level
- $ cd shell-lesson # go back into the data directory
- $ cd firstdir # type the first few characters, then press tab to autocomplete
- $ cd .. # go back into shell lesson
- $ ls -lh # check files included
- # on Mac, may need to change privacy settings to allow the terminal to have full disk access, and then restart terminal
- $ cat 829-0.txt # examine text file content
- $ head 829-0.txt # look at first 10 lines of the document (the default)
- $ tail 829-0.txt # examine last 10 lines of the document (the default)
- $ tail 829-0.txt -n 3 # look at just last 3 lines, can use up arrow to get previous commands
- $ less 829-0.txt # scroll through content, use spacebar to get next page, type q to quit
- $ head 829-0.txt 33504-0.txt # Examine multiple files at one time
- $ head 829-0.txt 33504-0.txt -n 5 # Specify number of lines to show
- $ head *.txt # examine all text files in current directory
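The head and tail notes above can be tried safely on a throwaway file. This is a minimal sketch, assuming a made-up 20-line file called numbers.txt (not part of the lesson data) created in a temporary directory:

```shell
# Work in a temporary directory so no lesson files are touched.
workdir=$(mktemp -d)
seq 1 20 > "$workdir/numbers.txt"   # hypothetical file with the lines 1..20

# With no -n flag, head (and tail) print 10 lines.
first_default=$(head "$workdir/numbers.txt" | wc -l | tr -d ' ')
# -n chooses how many lines to show.
first_three=$(head -n 3 "$workdir/numbers.txt")
last_three=$(tail -n 3 "$workdir/numbers.txt")

echo "head prints $first_default lines by default"
echo "first three:" $first_three
echo "last three:" $last_three
```

Putting the -n flag before the file name, as here, is the most portable ordering across Mac, Linux, and Git Bash.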
- $ mv 829-0.txt gulliver.txt # no output, mv - move
- $ ls # see that there is a file named gulliver.txt
- # can also copy
- $ cp gulliver.txt gulliver-backup.txt
- $ ls # see that there is now a backup copy
- $ head gulliver.txt gulliver-backup.txt # check they are similar
- $ mv firstdir backup
- $ ls # see that there is nothing
- $ cd .. # go up one directory
- $ pwd # check path
- $ ls # show files
- $ mkdir backup
- $ mv gulliver-backup.txt backup
- $ cd backup
- $ ls
- $ pwd
- $ cd ..
- $ echo 'hello world!' # output text to the terminal
- $ echo 'hi my name is ...'
- $ echo "$NAME is a fantastic library carpentry student" # variable not specified, so skipped
- $ NAME = Joe # fails
- $ NAME=ANN # seems to work
- $ echo '$NAME is a fantastic library carpentry student' # variable not output
- $ echo "$NAME is a fantastic library carpentry student" # variable output
- $ NAME="Ann" # no spaces around =; adding quotes worked in Windows for echoing in the name
- $ echo "Finally it is nice and sunny on" $(date)
- # if you type
- $ echo "
- # you will get
- >
- # where you can type more text
- # to exit either add a closing quotation mark and press enter
- > "
- # or use ctrl + c
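The variable and quoting notes above can be summarized in one short sketch. The variable names single, double, and today are only illustrative:

```shell
NAME="Ann"                      # assignment: no spaces around the = sign
single='$NAME is a student'     # single quotes keep the text literally
double="$NAME is a student"     # double quotes expand $NAME
today="Today is $(date +%A)"    # $(...) runs a command and inserts its output

echo "$single"
echo "$double"
echo "$today"
```

Writing `NAME = Ann` with spaces fails because the shell then tries to run a command called NAME.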
- # Be careful using rm as there is no Trash, it removes permanently
- $ touch a.txt b.txt c.txt # create some files
- $ ls # list files to check they have been created
- $ head a.txt # check contents
- # Loops
- $ for filename in ?.txt
- > do
- > echo "$filename"
- > cp "$filename" backup_"$filename"
- > done
- # can try copying
- $ for filename in ?.txt
- > do
- > echo "$filename"
- > cp "$filename" backup
- > done
- # #! (shebang) on the first line of a script names the interpreter
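The loops above can be run from start to finish as follows. This is a sketch only, assuming a temporary directory and empty placeholder files rather than the lesson data:

```shell
# Set up a scratch directory with three empty files and a backup folder.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt c.txt
mkdir backup

# ?.txt matches any file with exactly one character before .txt
for filename in ?.txt
do
    echo "$filename"                      # show which file is being handled
    cp "$filename" backup/"$filename"     # copy it into the backup directory
done

ls backup
```

Every loop needs a closing `done`; without it the shell keeps showing the `>` prompt and waits for more input.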
- # get example bash script
- # https://librarycarpentry.org/lc-shell/04-loops.html
- # https://librarycarpentry.org/lc-shell/files/my_first_bash_script.sh
- $ nano my_first_bash_script.sh # open the file, examine it, exit using ctrl x
- $ bash my_first_bash_script.sh # run the script
- $ ls -lhS # examine file sizes
- $ nano 2014-01_JA.tsv # examine in the editor, exit using ctrl x
- $ cat 2014-01_JA.tsv # print it out to screen, takes a long time, use ctrl c to stop output
- $ head -n 3 2014-01_JA.tsv # Examine first three lines, see metadata
- $ wc *.tsv # count lines words and bytes of listed .tsv files
- $ wc -l *.tsv # get number of lines in each .tsv file only
- $ wc -l *.tsv > lengths.txt # redirect output to a text file
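Counting lines and saving the result can be combined with sorting to find the shortest file. A minimal sketch, assuming two small made-up files (short.tsv and long.tsv) in a temporary directory instead of the lesson data:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\nb\nc\n' > short.tsv         # hypothetical 3-line file
printf 'a\nb\nc\nd\ne\n' > long.tsv    # hypothetical 5-line file

wc -l *.tsv > lengths.txt              # > redirects the output into a file
# | pipes the output of one command into the next one
shortest=$(sort -n lengths.txt | head -n 1 | awk '{print $2}')
echo "shortest file: $shortest"
```

The `>` operator writes to a file, while `|` hands output to another command; the two are often used together.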
Git and GitHub
github.com #sign in or sign up for an account
#create an organization #or you can skip to 'Create a repository'
#to create a new repository follow the below steps
- at the owner*/repository name* line -> (insert a name)
- select the 'public' button #others will be able to see your repository online
- select 'Add a README file'
- instructor selected "choose license" -> "MIT license" #see link below on reference materials on licenses
- select 'Create repository'
- # https://opensource.org/licenses/
#Collaborate and share materials
#Host a webpage
Settings
(on left side), see 'Code and automation'
- Select 'Pages' option
- On the "Pages - build and deployment" . See 'Branches'
- Change 'none' to 'main' #which branch to deploy from
Return to the repository main page. Choose 'Code'; you will see '1 branch'
see
https://docs.github.com/en/pages/getting-started-with-github-pages/configuring-a-publishing-source-for-your-github-pages-site
Select 'README.md' #uses markdown language https://www.markdownguide.org/cheat-sheet
#in markdown, hash marks (#) at the start of a line create headings in the README file
#Type the following indented text as shown in the README file:
- # (insert your repository name)
- ## Second level heading
- ### Third level heading
- **bold**
- *italic*
- #### Linux Shell Commands
- | Command | Explanation |
- |-- |-- |
- |ls | list files and directories |
- |mv | move command |
See the green button to the right -> Select 'commit changes' #you can also preview
in the 'extended description' box include details on adds/edits i.e 'add linux commands' #as a note for your changes
Click on repository name at top #takes you back to the main repository
# Collaborate with instructor (or owner of repository) by forking and making pull requests
Choose 'Improve this page' #bottom right of page to make edits to owner's README
Choose AUC23 repository #this
Choose 'fork' button on top of page #copy of someone's repository to contribute to their work
Click on 'Create fork' #this forks (copies) to the instructor's or workshop attendee's repository
#Back on your page
Choose 'main' from the pull-down menu
Create a branch #as an FYI there are two types of branches
- #main branch
- #working branch
#Collaborate with the owner through a pull request
Click 'ReadMe' #edits the owner's file
Click on the pencil symbol
- #Add another LINUX command to the owner's document
See the green button to the right -> 'commit changes' #you can preview adds/edits
In the 'extended description' box include details on edits, i.e. 'added new linux commands' #as a note for your changes
To go back to the main repository, Click on repository name
Click 'Compare and pull request' #this opens a new pull request
Select 'Create pull request' #this will send in a request to the owner of the document
#As the owner of the document, now others who have collaborated with you on your project have sent a pull request
- You have a 'pull request' -> Click on button
- 'merged' #add in a note to the collaborator
- Code
- Commits #you can see changes that have been made
#From owner's repository - can see the forks
Select 'Forks' from the top right menu
#can review changes from others' edits
#can go to the conversation and comment to the contributor (sending a thank you is a positive response)
'Merge pull request'
Select '<> Code' (top left of the menu) #can see the new commands have been added to the owner's document
https://librarycarpentry.org/lc-git/05-github-pages.html
What can you use a GitHub pages site for
- Create a course website
- Create a resource hub for students
- Create notices/announcements interact with students and researchers
- Develop student group projects page
- Create a database to enhance teaching outcomes
- Facilitate a workshop and distribute materials
#Next exercise: Learning how to execute these commands from the "Command" line
- Why? #sharing scripts from the Shell in Git using command line
- Why? #using Git to manage your files
#Move over to the Terminal/UNIX Shell
$ cd .. #go up a directory to Desktop (OR the previous level to your shell-lesson folder OR stay on your Desktop)
$ mkdir git #make directory called git
$ ls
$ cd git #go into the git directory
#set up git repository
$ git config --list
$ git config --global user.name " " #your name goes between the quotes
$ git config --global user.email " " #your email goes between the quotes
$ git config --list
$ mkdir bash_script #create a directory called bash_script
$ cd bash_script
$ git init
$ ls
$ ls -a #look for everything in directory, you'll see the .git directory
$ ls .git
$ nano loops.sh #creates a .sh file called loops
#type the following indented text; the first line (#!/bin/bash) sets bash as the interpreter
- #!/bin/bash
- for filename in *.*
- do
- echo "$filename"
- done
CTRL O #saves (press Enter to confirm the file name)
CTRL X #exits
$ git log #loops.sh is not there yet
$ git add loops.sh
$ git log #not there
$ git status #staged but not committed
$ git commit -m "add loops example"
$ git log #has history of what is present
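The init, add, commit sequence above can be rehearsed in a scratch repository. This is a sketch under a few assumptions: it runs in a temporary directory, and the name "Example User" and address user@example.org are placeholders passed only to this one commit instead of being saved with git config --global:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
git init --quiet

# Write the loops.sh script from the notes without opening an editor.
printf '#!/bin/bash\nfor filename in *.*\ndo\n    echo "$filename"\ndone\n' > loops.sh

git status --short    # loops.sh is listed as untracked
git add loops.sh
git status --short    # loops.sh is now staged
git -c user.name="Example User" -c user.email="user@example.org" \
    commit --quiet -m "add loops example"

commit_count=$(git log --oneline | wc -l | tr -d ' ')
echo "commits so far: $commit_count"
```

A file only appears in `git log` after it has been both added and committed.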
#possible solution for Mac users getting Xcode errors (e.g. 'invalid active developer path')
https://careerkarma.com/blog/python-nameerror-name-xrange-is-not-defined/#:~:text=Conclusion-,The%20xcrun%3A%20error%3A%20invalid%20active%20developer%20path%20(%2FLibrary%2F,install%20and%20install%20Xcode%20Tools
https://developer.apple.com/xcode/resources/
------
One up: I like the notes - easy to follow
One down: provide additional resources for other operating systems (for Mac users, Linux).
One Up One Down:
One Up: Learned how to use GIT to collaborate
One Up: I was able to set up my github, fork, and pull a request
One Down: Future workshops should record
One up: having access to the etherpad/instructions in real time, made it easy to follow along and set up my exercises.
One down: It would be nice to have access to recorded lectures and activities for reference.
One down: I stupidly "update" mac right before workshop started, it took 40 minutes to reboot. Had to rush using another laptop to sign-up for the workshop
One up: learned quite some from this workshop, and enjoyed!
One up: I learned a lot about using the terminal because this is my first time using it. Also I enjoyed the interactive EtherPad
One Down: Because I am not familiar with the coding commands I may move a little slower than the presenter.
One Up: Learned how to make a repository
One Down: Moving too fast with explanations
- One up: I did figure out how to use the terminal on my MacBook.
- One Down: I still really have no clue what I am doing! I tried to keep up!
- Britt:
- One up: I liked that it went at a steady pace as well as having access to the etherpad. It made it really easy to follow along and stay on task. The instructors were also very helpful and willing to answer questions
- One Down: In the future having two break out rooms in which one is for mac users and the other for windows if possible. Although most requirements were the same, some were not which made it confusing for some users.
- One Up: I enjoying new material and love using the etherpad. We used in a previous session and I have used in other classes and meetings
- One Down: I think I may be in over my head, not quitting but feel as though I may need an in between or prep class
DAY 1 Summary
Unix Shell Commands
ls
mkdir
pwd
cp
mv
rm
head
tail
do
done
wc
cat
nano
echo
Git and GitHub
- Create a repository
- Collaborate:
- Fork a repository
- Make changes in a branch
- Make a pull request
- Merge a pull request
- Make a GitHub pages site
- Markdown format
- Local Git repository
Further Resources
https://ale.org/
https://www.meetup.com/ALE-Atlanta-Linux-Enthusiasts/
Day 2
OpenRefine Lesson
https://librarycarpentry.org/lc-open-refine/index.html
https://librarycarpentry.org/lc-open-refine/data/doaj-article-sample.csv
basic overview of joining projects in OpenRefine from the Univ. of Illinois: https://guides.library.illinois.edu/openrefine/joiningprojects
grel resource: https://openrefine.org/docs/manual/grel
Add'l resources from our workshop materials: https://librarycarpentry.org/lc-open-refine/#getting-help
- Welcome to our Library Carpentry workshop!
- Please sign in for today's session (6/21/2023).
-
- Name you'd like to be called | Any preferred pronouns | Favorite Snack
- Issifu Harruna /Banana
- Lyrric Jackson | she/her/hers | Brown butter chocolate chip cookies
- Ann Myatt James | she/her | trail mix with chocolate chips
- Benson Muite | he/him | Bhajia
- Chuang Peng | he/him | Pop corn
- Renee King/7-up pound cake
- Yvonne Phillips /she/her / chocoloate chip cookies
- Lynne Patten/coffee
- Kenzy Scott/ Sour patch kids mixed in movie theater popcorn, She/Her
- Tory / he/him/ Brown butter almond brittle ice cream
- Jessica Afangnibo She/Her/ Almost anything Chocolate
- Michael Dillon / yogurt
- Youseung Kim / He,him / Coffee
- Ernest Alema-Mensah|he/him|melons
- Robin Brice - she/her/ popcorn
- Mareena Pitts/ any fruit
- Tameyah Mathis-Perry /cheetos
Notes
- Authors > Edit Cells > Split Multivalued Cells > use | as separator
- Authors > Edit Cells > Join MultiValued Cells > use | as separator
- Subjects > Edit Cells > Split Multivalued Cells > use | as separator
- Subjects > Edit Cells > Join MultiValued Cells > use | as separator
- Publisher > Facet > Text Facet # in the new tab, experiment with including and excluding entries
- License > Facet > Text Facet # Most popular license, number without licenses, sort by name or count
- Publisher > Text Filter
- Date > Facet > Timeline # not so neat, need to clean data
- # Use facet by blank to find publications without DOI
- DOI > Facet > Customized Facets > Facet by blank # exclude those without DOIs
- # Remove facets
- # Correct language column
- Language > Facet > Text Facet # Change 'English' to 'EN' by clicking on Edit
- Authors > Edit Cells > Split Multivalued Cells > use | as separator
- Authors > Edit Cells > Cluster and edit.... # Use fingerprint method, select some to merge
- Authors > Text Filter # Examine Naveen, should all have been merged
- All > Edit Column > Re-order / Remove Columns # Just examine
- Title > Edit Column > Rename This Column ... # Just examine
- # Further guide https://guides.library.illinois.edu/openrefine/home
- Authors > Sort > Sort ...
- Authors > Sort > Remove Sort
- # GREL general refine expression language
- Publisher > Facet > Text Facet
- Publisher > Edit Cells > Common Transforms > Collapse Consecutive Whitespace # cleaned entry
- # Exercise
- # Highlight Akshantala Enterprises and Society of Pharmaceutical Technocrats in Publisher Text Facet
- Title > Edit Cells > Transform > value.toTitlecase()
- # Deselect Akshantala Enterprises and Society of Pharmaceutical Technocrats
- # Break
- # Examine Undo/Redo
- # Export, pasted below, but can also download json file
- [
- {
- "op": "core/multivalued-cell-split",
- "columnName": "Authors",
- "keyColumnName": "Title",
- "mode": "separator",
- "separator": "|",
- "regex": false,
- "description": "Split multi-valued cells in column Authors"
- },
- {
- "op": "core/multivalued-cell-join",
- "columnName": "Authors",
- "keyColumnName": "Title",
- "separator": "|",
- "description": "Join multi-valued cells in column Authors"
- },
- {
- "op": "core/multivalued-cell-split",
- "columnName": "Subjects",
- "keyColumnName": "Title",
- "mode": "separator",
- "separator": "|",
- "regex": false,
- "description": "Split multi-valued cells in column Subjects"
- },
- {
- "op": "core/multivalued-cell-join",
- "columnName": "Subjects",
- "keyColumnName": "Title",
- "separator": "|",
- "description": "Join multi-valued cells in column Subjects"
- },
- {
- "op": "core/mass-edit",
- "engineConfig": {
- "facets": [],
- "mode": "row-based"
- },
- "columnName": "Language",
- "expression": "value",
- "edits": [
- {
- "from": [
- "English"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "EN"
- }
- ],
- "description": "Mass edit cells in column Language"
- },
- {
- "op": "core/multivalued-cell-split",
- "columnName": "Authors",
- "keyColumnName": "Title",
- "mode": "separator",
- "separator": "|",
- "regex": false,
- "description": "Split multi-valued cells in column Authors"
- },
- {
- "op": "core/mass-edit",
- "engineConfig": {
- "facets": [],
- "mode": "row-based"
- },
- "columnName": "Authors",
- "expression": "value",
- "edits": [
- {
- "from": [
- "Ashutosh",
- "Ashutosh "
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Ashutosh"
- },
- {
- "from": [
- "A. Khan Vakeel",
- "Vakeel A. Khan"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "A. Khan Vakeel"
- },
- {
- "from": [
- "Chandra Naveen",
- "Naveen Chandra"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Chandra Naveen"
- },
- {
- "from": [
- "B. K. Revathi",
- "B. K Revathi"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "B. K. Revathi"
- },
- {
- "from": [
- "Santiago Garcia-Granda",
- "Santiago García-Granda"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Santiago Garcia-Granda"
- },
- {
- "from": [
- "Jian-Chao Yuan",
- "Jianchao Yuan"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Jian-Chao Yuan"
- },
- {
- "from": [
- "Chang-Ge Zheng",
- "ChangGe Zheng"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Chang-Ge Zheng"
- },
- {
- "from": [
- "Il'ya A. Gural'skiy",
- "Il`ya A. Gural`skiy"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Il'ya A. Gural'skiy"
- },
- {
- "from": [
- "Rongbin Huang",
- "Rong-Bin Huang"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Rongbin Huang"
- },
- {
- "from": [
- "Sheng-Lan Zhao",
- "Shenglan Zhao"
- ],
- "fromBlank": false,
- "fromError": false,
- "to": "Sheng-Lan Zhao"
- }
- ],
- "description": "Mass edit cells in column Authors"
- },
- {
- "op": "core/column-reorder",
- "columnNames": [
- "Title",
- "Authors",
- "DOI",
- "ISSNs",
- "URL",
- "Date",
- "Subjects",
- "Language",
- "Publisher",
- "Citation",
- "Licence"
- ],
- "description": "Reorder columns"
- },
- {
- "op": "core/text-transform",
- "engineConfig": {
- "facets": [
- {
- "type": "text",
- "name": "Authors",
- "columnName": "Authors",
- "query": "",
- "mode": "text",
- "caseSensitive": false,
- "invert": false
- }
- ],
- "mode": "row-based"
- },
- "columnName": "Publisher",
- "expression": "value.replace(/[\\p{Zs}\\s]+/,' ')",
- "onError": "keep-original",
- "repeat": false,
- "repeatCount": 10,
- "description": "Text transform on cells in column Publisher using expression value.replace(/[\\p{Zs}\\s]+/,' ')"
- },
- {
- "op": "core/text-transform",
- "engineConfig": {
- "facets": [
- {
- "type": "text",
- "name": "Authors",
- "columnName": "Authors",
- "query": "",
- "mode": "text",
- "caseSensitive": false,
- "invert": false
- },
- {
- "type": "list",
- "name": "Publisher",
- "expression": "value",
- "columnName": "Publisher",
- "invert": false,
- "omitBlank": false,
- "omitError": false,
- "selection": [
- {
- "v": {
- "v": "Akshantala Enterprises",
- "l": "Akshantala Enterprises"
- }
- },
- {
- "v": {
- "v": "Society of Pharmaceutical Technocrats",
- "l": "Society of Pharmaceutical Technocrats"
- }
- }
- ],
- "selectBlank": false,
- "selectError": false
- }
- ],
- "mode": "row-based"
- },
- "columnName": "Title",
- "expression": "grel:value.toTitlecase()",
- "onError": "keep-original",
- "repeat": false,
- "repeatCount": 10,
- "description": "Text transform on cells in column Title using expression grel:value.toTitlecase()"
- }
- ]
- # Remove all facets and filters
- Date > Facet > Custom Text Facet > value.toDate("dd/MM/yyyy")
- Date > Edit Column > Add Column Based on this Column > value.toString("dd MMMM yyyy")
- Authors > Edit Cells > Split Multivalued Cells > use | as separator
- Authors > Facet > Custom Text Facet > value.contains(",")
- # Under All star first row
- All > Facet > Facet by Star
- # Select True row in Facet
- URL > Edit Column > Add column by fetching URL based on column URL
- expand show # text is a little small, hard to find
- Lyrric's Command History for OpenRefine
- Split multi-valued cells in column Authors
- Join multi-valued cells in column Authors
- Split multi-valued cells in column Subjects
- Join multi-valued cells in column Subjects
- Mass edit cells in column Language
- Split multi-valued cells in column Authors
- Mass edit cells in column Authors
- Reorder columns
- Text transform on cells in column Publisher using expression value.replace(/[\p{Zs}\s]+/,' ')
- Text transform on cells in column Title using expression grel:value.toTitlecase()
- Text transform on cells in column Date using expression grel:value.toDate("dd/MM/yyyy")
- Create column Formatted Date at index 4 based on column Date using expression grel:value.toString("dd MMMM yyyy")
- Split multi-valued cells in column Authors
- Text transform on cells in column Authors using expression grel:value.split(", ").reverse().join(" ")
- Star row 1
- Create column Journal-Details at index 7 by fetching URLs based on column ISSNs using expression grel:"https://api.crossref.org/journals/"+value
- Star row 2
- Unstar row 2
- Star row 3
- Star row 5
- Reconcile cells in column Publisher to type /organization/organization
- Match item Molecular Diversity Preservation International (126554238) for cells containing "MDPI AG" in column Publisher
- Unstar row 1
- Unstar row 3
- Unstar row 5
- Reconcile cells in column Publisher to type /organization/organization
- Match item International Union of Crystallography. (158070937) for cells containing "International Union of Crystallography" in column Publisher
- Match each cell to its best recon candidate in column Publisher
- Create column VIAF-ID at index 3 based on column Publisher using expression grel:cell.recon.match.id
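The author transform in the history above, grel:value.split(", ").reverse().join(" "), turns "Last, First" into "First Last". The same idea can be sketched in the shell. The function name reorder_name is hypothetical, and this awk version is only an illustration of the logic, not what OpenRefine runs internally:

```shell
# Split a name on ", ", reverse the pieces, and join them with a space.
reorder_name() {
    printf '%s\n' "$1" | awk -F', ' '{
        out = $NF                                      # start with the last piece
        for (i = NF - 1; i >= 1; i--) out = out " " $i # then add earlier pieces
        print out
    }'
}

reorder_name "Chandra, Naveen"
reorder_name "Ashutosh"
```

A name without a comma has only one piece, so it comes back unchanged.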
- Day 2: Git continued
- Working in the shell/terminal/git bash
- setup a git repository
- user.name = Benson Muite
- user.email=benson_muite@emailplus.org
- color.ui=auto
- core.editor=nano -w
- credential.helper=cache
- init.defaultbranch=main
- sendemail.smtpserver=smtp.fastmail.com
- sendemail.smtpuser=benson_muite@emailplus.org
- sendemail.smtpencryption=tls
- sendemail.smtpserverport=587
- sendemail.smtppass=[redacted]
- http.https://repo.or.cz.sslcert=/home/benson/certs/rorcz_bkmgit_user_1.pem
- http.https://repo.or.cz.sslkey=/home/benson/.ssh/repo_or_cz_rsa
- http.https://repo.or.cz.sslcertpasswordprotected=tru
- [benson@localhost hello_world]$ git config --global init.defaultBranch main
- [benson@localhost hello_world]$ git init
- Initialized empty Git repository in /home/benson/Projects/Carpentries/Carpentries
- Helping/June-20-2023-AUC/git/hello_world/.git
- [benson@localhost hello_world]$ ls
- [benson@localhost hello_world]$ ls -a
- [benson@localhost hello_world]$ ls -a .git
- $mv bash_script ..git log
- $ls
- $index.md
- $git index.md
- git: 'index.md' is not a git command. See 'git --help'.
- $git @index.md
- git: '@index.md' is not a git command. See 'git --help'.
- $git add index.md
- warning: in the working copy of 'index.md', LF will be replaced by CRLF the next time Git touches it
- $git status
- On branch main
- No commits yet
- Changes to be committed:
- (use "git rm --cached <file>..." to unstage)
$git commit
[main (root-commit) 2e6ef11] Add initial file index.md
- pathspec 'index.md' did not match any files
- . .. .git
- 1082 cd git
- 1083 ls
- 1084 mkdir hello_world
- 1085 cd hello_world
- 1086 git config --global user.name "Benson Muite"
- 1087 git config --global user.email "benson_muite@emailplus.org"
- 1088 git config --list
- 1089 git config --global core.editor "nano -w"
- 1090 git config --list
- 1091 git config --global init.defaultBranch main
- 1092 git init
- 1093 ls gitgit
- 1096 ls
- 1097 nano index.md
- 1098 git add index.md
- 1099 git status
- 1100 git commit
- 1101 ls
- 1102 git status
- 1103 git log
- 1104 history
- 1105 ls ~/.ssh
- 1106 clear
- 1107 ssh-keygen -t ed25519 -C "benson_muite@emailplus.org"
- 331 ls
- 332 ls -lrt
- 333 git add index.md
- 334 git status
- 335 git commit
- 336 git status
- 337 git log
- 338 ssh-keygen -t ed25519 -C "robinbrice11@gmail.com"
- 339 ls ~/.ssh
- 340 cat ~/.ssh
- 341 cat ~/.ssh/id_ed25519.pub
- 342 git remote add origin https://github.com/rbrice11/Hello-World.git\ngit branch -M main\ngit push -u origin main
- 343 git remote add origin git@github.com:rbrice11/Hello-World.git\ngit branch -M main\ngit push -u origin main
- 344 git remote rm origin
- 345 git remote add origin git@github.com:rbrice11/Hello-World.git\ngit branch -M main\ngit push -u origin main
- 346 git pull
- # https://www.howtogeek.com/devops/how-to-switch-a-github-repository-to-ssh-authentication/
- https://git-send-email.io
An interesting open source project, Repository (GitHub/Gitee/GitLab/SourceHut/BitBucket etc)
Day 2: SQL Lesson Materials
https://sqlitebrowser.org/dl/
https://zenodo.org/record/2822005
About Zenodo as a resource for making files, data, and research open, findable, and sharable: https://about.zenodo.org/
Resource that discusses the various flavors of SQL:
https://medium.com/the-everything-blog/sql-vendors-e31e417db7c2
SQL 1
SELECT title
FROM articles;
SELECT Title, Authors, ISSNs, Year, DOI
FROM articles;
SELECT *
FROM articles;
SELECT DISTINCT ISSNs
FROM articles;
SELECT DISTINCT ISSNs
FROM journals;
SELECT DISTINCT ISSNs, Day, Month, Year
FROM articles;
SELECT *
FROM articles
ORDER BY ISSNs ASC;
SELECT DISTINCT *
FROM articles
ORDER BY ISSNs ASC;
SELECT *
FROM articles
ORDER BY ISSNs DESC, First_Author ASC;
SELECT *
FROM articles
WHERE ISSNs='2056-9890';
SELECT *
FROM articles
WHERE (ISSNs='2056-9890') AND (Month>10);
SELECT *
FROM articles
WHERE (ISSNs='2076-0787') OR (ISSNs='2077-144');
SELECT *
FROM articles
WHERE Subjects LIKE '%Crystal Structures%';
SELECT *
FROM articles
WHERE (ISSNs = '2076-0787') OR (ISSNs = '2077-1444')
OR (ISSNs='2067-2764|2247-6202');
SELECT *
FROM articles
WHERE (ISSNs = '2076-0787') OR (ISSNs='2077-1444');
/* This is a block comment
for another example which spans multiple lines*/
-- Comments on a single line can make use of just two dashes
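-- First we have the fields we want to display
SELECT articles.Title, articles.First_Author,
journals.Journal_Title, publishers.Publisher
-- from the first TABLE
FROM articles
-- join it with the second TABLE
JOIN journals
-- on the column the two tables share
ON articles.ISSNs = journals.ISSNs
-- join the third TABLE (assumes publishers.id matches journals.PublisherId)
JOIN publishers
ON publishers.id = journals.PublisherId;
SELECT *
--from the first TABLE
FROM articles
--join it with the second TABLE
JOIN journals
-- on the column the two tables share
ON articles.ISSNs = journals.ISSNs;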
/* Aggregation */
SELECT ISSNs, AVG(Citation_Count)
FROM articles
GROUP BY ISSNs;
SELECT ISSNs, AVG(Citation_Count)
FROM articles
GROUP BY ISSNs
ORDER BY AVG(Citation_Count) DESC;
SELECT ISSNs, AVG(Citation_Count)
FROM articles
GROUP BY ISSNs
HAVING count(Title) >=10
ORDER BY AVG(Citation_Count) DESC;
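The GROUP BY, HAVING, and ORDER BY steps above can be tried on a tiny table from the terminal. This is a sketch under several assumptions: the workshop used DB Browser for SQLite, so the sqlite3 command-line tool is an extra that may need installing; the three rows are made-up values, not the lesson data; and the threshold is lowered from 10 to 2 to suit the tiny table:

```shell
# Run the query against a throwaway in-memory database.
result=$(sqlite3 :memory: <<'SQL'
CREATE TABLE articles(Title text, ISSNs text, Citation_Count integer);
-- made-up rows for illustration only
INSERT INTO articles VALUES
  ('t1', '1111-1111', 4), ('t2', '1111-1111', 6),
  ('t3', '2222-2222', 10);
SELECT ISSNs, AVG(Citation_Count)
FROM articles
GROUP BY ISSNs
HAVING COUNT(Title) >= 2
ORDER BY AVG(Citation_Count) DESC;
SQL
)
echo "$result"
```

HAVING filters whole groups after aggregation, so the journal with a single article is dropped even though its average is higher.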
/* Save a Query */
CREATE VIEW journal_counts AS
SELECT ISSNs, COUNT(*)
FROM articles
GROUP BY ISSNs;
SELECT *
FROM journal_counts;
DROP VIEW journal_counts;
/* Create a database table */
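-- ISSN-L contains a hyphen, so the column name must be quoted
CREATE TABLE journals2(id text, "ISSN-L" text, ISSNs text,
PublisherId text, Journal_Title text);
-- a simpler two-column version (drop journals2 first if it already exists)
CREATE TABLE journals2(id text, ISSNs text);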
We'd appreciate it if you all completed the post-workshop survey, which gives us useful feedback for future workshops!
https://carpentries.typeform.com/to/UgVdRQ?slug=2023-06-20-aucenter-online