Welcome to The Carpentries Etherpad!

If the etherpad goes down, we can switch to this Google Doc: https://docs.google.com/document/d/1Ap3k_7ZKV__qhQphlgXjJHnJimfNj_WXRrqK38kdRfY/edit?usp=sharing

This pad is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.

Use of this service is restricted to members of The Carpentries community; this is not for general purpose use (for that, try etherpad.wikimedia.org).

Users are expected to follow our code of conduct: https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html

All content is publicly available under the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/

## Checklist

* Logistics - restrooms, water fountain, emergency exits, emergency contact
* Review Code of Conduct with learners (https://docs.carpentries.org/topic_folders/policies/code-of-conduct.html). We are all learners!
* Can you see the text (make it bigger?), can you hear us (speak louder?), can you slow down?, any other access questions?
* Schedule: OpenRefine, GitHub
* Remind learners to use sticky notes to give feedback
* Get feedback at lunch and end of each day using sticky notes
* Collect attendee names
* Twitter hashtag is #all_lc; the MRC Twitter handle is https://twitter.com/all_metadata
* Lessons are online at https://librarycarpentry.org/lessons/
* Send out the post-workshop survey (https://www.surveymonkey.com/r/lcpostworkshopsurvey?workshop_id=2019-01-05-MTSU)

## Roll Call

Garumma Feyissa / Drexel University / gtf27@drexel.edu / https://twitter.com/garummafeyissa
Juliane Schneider / Harvard Catalyst / Juliane_Schneider@hms.harvard.edu / @JulianeS
Chris Erdmann / UNC RENCI / erdmannc@renci.org / @libcce
Claire King / Drexel University / Design Research / cak@drexel.edu
Liz Jones-Minsinger / Haverford College Libraries / ejonesmins@haverford.edu / @LizJonesAll1Wrd
Hanlin Zhang / UNC - Chapel Hill / hanlin.zhang@unc.edu
Hyung Wook Choi / Drexel University / hc685@drexel.edu
YeJin Choi / Ewha Womans Univ. / brightyejin@gmail.com
Sonia Pascua / Drexel University / smp458@drexel.edu
Wade Bishop / University of Tennessee
Alyson Gamble / Simmons University / gamblea@simmons.ed
Julie Coy / Haverford College Libraries / jcoy@haverford.edu
Jamillah Gabriel / UI Champaign-Urbana / jrg3@illinois.edu
Bridget Disney / University of Missouri / bridget.disney@gmail.com / @bridget_disney
Julaine Clunis / Kent State University / jclunis@kent.edu
Zakiya Collier / Schomburg Center for Research in Black Culture, NYPL
Deborah Garwood / Drexel University / dgarwood@drexel.edu
Tammi Lawson / Schomburg Center for Research in Black Culture, NYPL
Karen Boyd / University of Maryland / klboyd@umd.edu / @karen_leslie
Minh Pham / University of Missouri, Columbia / mtpr3d@mail.missouri.edu
Sam Grabus / Drexel University, Philadelphia / smg383@Drexel.edu
Kathleen Padova / Drexel University / EPAM Systems / kjp24@drexel.edu
Kat Murray / Drexel University / CCI Adjunct-DUCOM ENT-Otolaryngology Instructor / kathleen.murray at-sign drexel.edu

## Agenda

## Open Science Concepts
"Open Science" itself Open Science vs Open Data vs Open Research vs Open Access Linked Open Data ( an example visualization https://karenleslie.github.io/theory-family-tree/colorcodedbytype.html ) Forking in GitHub data carpentry Etherpad XSLT SPARQL Query [software specific terms; e.g., R] Archive Persistence Hadoop Linux EUP data science potential (librarians vs programmers) - bridging the gap what is meant by "open"? (free, accessible) methods - not standardized, more like exploration, best methods data architectures machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) distributed data OpenRefine BEFORE WE START: You can download doaj-article-sample.csv https://raw.githubusercontent.com/LibraryCarpentry/lc-open-refine/gh-pages/data/doaj-article-sample.csv which is a csv file that will open in a new browser tab. Be sure to right click or control click in order to save the file (NOTE: In Safari, right click and select download linked file; in Chrome and Firefox, right click and select save link as). Make a note of the location (i.e the folder, your desktop) to which you save the file. Opening OpenRefine If you click on the icon and nothing happens, open a browser and type http://127.0.0.1:3333/. That usually opens the tool in the browser. OpenRefine is most useful where you have data in a simple tabular format such as a spreadsheet, a comma separated values file (csv) or a tab delimited file (tsv) but with internal inconsistencies either in data formats, or where data appears, or in terminology used. OpenRefine can be used to standardize and clean data across your file. 
It can help you:
* Get an overview of a data set
* Resolve inconsistencies in a data set, for example standardizing date formatting
* Split data up into more granular parts, for example splitting up cells with multiple authors into separate cells
* Match local data up to other data sets, for example in matching local subjects against the Library of Congress Subject Headings
* Enhance a data set with data from other sources

### Writing Transformations

GREL expression used:
value.toTitlecase()

### Transforming Strings, Numbers, Dates and Booleans

GREL expressions used:
value.toDate("dd/MM/yyyy")
value.toString("dd MMMM yyyy")
value.contains(",").toString()

### Transformations - Handling Arrays

GREL expressions used:
value.contains(",").toString()
value.match(/(.*),(.*)/)
value.match(/(.*),(.*)/).reverse().join(" ")

### Advanced OpenRefine functions

Crossref API Exercise:
Crossref API: https://github.com/CrossRef/rest-api-doc
Under the "Show" button add ; mailto:
The syntax for requesting journal information from CrossRef is http://api.crossref.org/journals/{ISSN}
GREL expression to extract title: value.parseJson().message.title

VIAF Reconciliation Exercise:
Where to find the reconciliation service: http://refine.codefork.com/
GREL expression for extracting VIAF IDs: cell.recon.match.id

Free Your Metadata: https://freeyourmetadata.org/
Regex lesson: https://librarycarpentry.org/lc-data-intro/01-regular-expressions/index.html

## GitHub

https://github.com/libcce
https://github.com/pitviper6
https://github.com/ejonesmins
https://github.com/juliecoy
https://github.com/kpadova
https://github.com/bradleywadebishop
https://github.com/SoniaMPascua
https://github.com/karenleslie
https://github.com/choihywook
https://github.com/gamblealyson
https://github.com/choiyejin87
https://github.com/deborah-g
https://github.com/hanlin-zhang
https://github.com/cvkeng
https://github.com/bridgetdisney
https://github.com/sgrabu1
https://github.com/z-collier
https://github.com/jamrasgab
https://github.com/MinhphamMizzou
https://github.com/jazzmurr
https://julclunis.github.io/project-website/

A little background:
Linus Torvalds and Linux
- Devised a collaborative/asynchronous approach to developing code
- Trouble creating a welcoming environment https://adtmag.com/blogs/dev-watch/2014/04/linus-torvalds-rants.aspx
GitHub made Git more accessible (because not all of us work on the command line 24/7)
- Like Linus, GitHub has had its troubles https://techcrunch.com/2019/11/13/github-faces-more-resignations-in-light-of-ice-contract/
- Alternatives: Gitea, Gogs, maybe GitLab (employees have resigned over their policies as well) https://gitea.io/en-us/
GitHub’s increasing popularity in research: https://www.nature.com/news/democratic-databases-science-on-github-1.20719
- Code, data, workflows, papers, websites, lessons… https://github.com/LibraryCarpentry/lc-open-refine

Let's create GitHub accounts:
Create your GitHub account and list it here: https://github.com/libcce

Create your first repository (folder):
- Click on the plus + button (top right corner)
- Give the repository a name (openrefine-history), choose public, initialize with a README, and give a license

Upload your OpenRefine history JSON file to a folder called crossref:
- Create a new file -> crossref/placeholder.txt. Create a commit message (more on this) and click create new file.
- Then upload your JSON file to the crossref folder (with a commit message as well).

Write your commit messages in the imperative: https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53
- "Fix bug" instead of "Fixes bug"
- Be detailed and don’t try to do too much
- Why? It’s easier for the person that has to review your changes :)

Click on the pencil icon to edit/change the file you uploaded. Change the crossref API URL to https://portal.issn.org/resource/ISSN/. Write a commit message and commit the change. Click on commits and then the hash to see the diff of the change you just made.

### Why is this important?
GitHub-Zenodo -> share and preserve -> reproducibility, credit
https://guides.github.com/activities/citable-code/

You can also use GitHub for your papers, e.g.:
Click the fork button to copy the repository to your own GitHub organization.

### You can also find other themes and fork them, for example, for event websites: https://github.com/libcce/TriangleJupyter

### Create a new repository and call it project-website.
Go to settings, choose GitHub Pages -> None -> Master Branch and choose a theme (Minimal).
Jekyll is a simple, blog-aware, static site generator for personal, project, or organization sites. Written in Ruby by Tom Preston-Werner, GitHub's co-founder, it is distributed under the open source MIT license.

### Post-it exercise w/ cats (demo of a pull request)
https://guides.github.com/introduction/flow/

### https://github.com/libcce/project-website-again
Go to libcce/project-website/README.md -> click pencil edit icon -> message telling you…
You’re editing a file in a project you don’t have write access to. Submitting a change to this file will write it to a new branch in your fork /project-website, so you can send a pull request.
Add text/Markdown (https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) at the end of the file, add a commit message, propose the file change, and submit the pull request. We will review and merge the pull request together.

That’s one way to create a website. Another option, Netlify + Hugo:
Netlify
- offers hosting and serverless backend services for static websites (Jekyll, Hugo…)
- continuous deployment from git
- supports JAMstack (JavaScript, APIs, Markup) https://jamstack.org/
Hugo is a popular open-source static site generator and the Academic framework allows you to create static academic sites.
Let’s install Academic https://sourcethemes.com/academic/
Get Started -> Install w/ Netlify -> Connect GitHub -> repository name -> Walk through instructions.
Other themes to explore, e.g. for your lab.

If there is time…
GitHub Desktop https://desktop.github.com
If you start using Git more, this is a great tool for making wider changes.

For later…
Learn about git via the command line… https://librarycarpentry.org/lc-git/
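
Also for later: if you want to see what the GREL expressions from the OpenRefine section are doing, here is a rough Python approximation. It is only a sketch for comparison, not how OpenRefine implements GREL. The function names and the sample JSON are made up for illustration (the JSON is not a real Crossref response), and Python's str.title() can differ from toTitlecase() in edge cases such as apostrophes.

```python
import json
import re
from datetime import datetime

# Hypothetical, hand-written stand-in for a Crossref journal response.
SAMPLE_RESPONSE = '{"status": "ok", "message": {"title": "Example Journal"}}'


def to_titlecase(value: str) -> str:
    # Approximates: value.toTitlecase()
    return value.title()


def reformat_date(value: str) -> str:
    # Approximates: value.toDate("dd/MM/yyyy") then value.toString("dd MMMM yyyy")
    # Month names follow the current locale (English by default).
    parsed = datetime.strptime(value, "%d/%m/%Y")
    return parsed.strftime("%d %B %Y")


def contains_comma(value: str) -> str:
    # Approximates: value.contains(",").toString(), which yields "true" or "false"
    return str("," in value).lower()


def swap_around_comma(value: str) -> str:
    # Approximates: value.match(/(.*),(.*)/).reverse().join(" ")
    # Like GREL's match, the pattern must cover the whole string.
    found = re.fullmatch(r"(.*),(.*)", value)
    if found is None:
        # Assumption of this sketch: leave values without a comma unchanged.
        return value
    return " ".join(reversed(found.groups()))


def journal_title(response_text: str) -> str:
    # Approximates: value.parseJson().message.title
    return json.loads(response_text)["message"]["title"]


if __name__ == "__main__":
    print(to_titlecase("open science"))
    print(reformat_date("05/01/2019"))
    print(contains_comma("Smith, Jane"))
    print(repr(swap_around_comma("Smith, Jane")))
    print(journal_title(SAMPLE_RESPONSE))
```

Note that the swapped name keeps the space that followed the comma, so "Smith, Jane" comes back with a leading space; trimming whitespace afterwards is a separate step.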