Modify content

So far, we’ve only added new content to the dataset. And we have not done much to that content up to this point, to be honest. Let’s see what happens if we add content, and then modify it.

For this, in the root of DataLad-101, create a simple text file called notes.txt. It will contain all of the notes that you take throughout the course.

Use an editor of your choice and write a short summary of how to create a DataLad dataset from scratch. In the code snippet further below, this note sits between the start and end delimiters of a here-document. The note reads:

“One can create a new dataset with ‘datalad create [--description] PATH’. The dataset is created empty”.

This is meant to be a note you would take in an educational course. You can write it to a file with an editor, or you can copy the full code snippet, from cat << EOT > notes.txt up to and including the EOT in the last line, and paste it into your terminal to write the note into notes.txt.

Find out more: How does a here-document work?

The code snippet below writes lines of text into a file called notes.txt, which does not exist yet.

To do this, the content of the “document” is wrapped in between delimiting identifiers. Here, these identifiers are EOT (short for “end of text”), but naming is arbitrary as long as the two identifiers are identical. The first “EOT” identifies the start of the text stream, and the second “EOT” terminates the text stream.

The characters << redirect the text stream into “standard input” (stdin), the standard location that provides the input for a command. Thus, the text stream becomes the input for the cat command, which takes the input and writes it to “standard output” (stdout).

Lastly, the > character takes stdout and creates a new file notes.txt with stdout as its contents.

It might seem like a slightly convoluted way to create a text file with a simple note in it. But it allows you to write notes from the terminal, which lets this book provide commands you can execute with nothing other than your terminal. You are free to copy-paste the snippets with the here-documents, or to find a workflow that suits you better. The only important thing is that you create and modify a .txt file over the course of the Basics part of this handbook.
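To see these steps in isolation, here is a minimal sketch that writes to a temporary directory rather than to DataLad-101. The file name heredoc-demo.txt, the delimiter END, and the variable names are made up for illustration. The sketch also shows one detail worth knowing: with an unquoted delimiter the shell expands variables inside the text, while a quoted delimiter keeps the text literal.

```shell
# Use a throwaway directory so nothing in DataLad-101 is touched.
scratch=$(mktemp -d)
demo="$scratch/heredoc-demo.txt"   # hypothetical file name

# Without a redirect, cat simply prints the here-document to stdout.
cat << END
printed to the terminal
END

# With > the same text stream ends up in a new file instead.
cat << END > "$demo"
written to a file
END

# Unquoted delimiter: the shell expands $name before cat sees the text.
name="DataLad"
cat << END >> "$demo"
expanded: $name
END

# Quoted delimiter: the text is passed on literally.
cat << 'END' >> "$demo"
literal: $name
END

cat "$demo"
```

The notes in this chapter contain no shell variables, so the unquoted EOT used here behaves the same as a quoted one would.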

Running the command below will create notes.txt in the root of your DataLad-101 dataset:

$ cat << EOT > notes.txt
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty

EOT

Run datalad status to confirm that there is a new, untracked file:

$ datalad status
untracked: notes.txt (file)

Save the current state of this file in the dataset’s version history. Because it is the only modification in the dataset, there is no need to specify a path.

$ datalad save -m "Add notes on datalad create"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)

Modify this file by adding another note. After all, you already know how to use datalad save, so write a short summary on that as well.

Again, the example below uses Unix commands to accomplish this (cat and redirection, this time with >> to append new content to the existing file), but you can use any editor of your choice.

$ cat << EOT >> notes.txt
The command "datalad save [-m] PATH" saves the file
(modifications) to history. Note to self:
Always use informative, concise commit messages.

EOT
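If you are unsure about the difference between > and >>, it is easy to check on a scratch file before touching notes.txt. The sketch below uses a made-up file name, redirect-demo.txt, in a temporary directory: > replaces whatever the file contained, whereas >> adds to the end.

```shell
# Throwaway location; the file name is hypothetical.
scratch=$(mktemp -d)
demo="$scratch/redirect-demo.txt"

echo "first note" > "$demo"      # creates the file
echo "second note" >> "$demo"    # appends a second line
cat "$demo"

echo "fresh start" > "$demo"     # overwrites both earlier lines
cat "$demo"
```

This is why the second note above is written with >>: using > there would have replaced the first note instead of extending the file.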

Let’s check the dataset’s current state:

$ datalad status
 modified: notes.txt (file)

and save the file in DataLad:

$ datalad save -m "add note on datalad save"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)

Let’s take another look at our history to see how this file developed. We’re using git log -p -n 2 to see the last two commits and, for each commit, the difference from the previous state of the file. Note: git log may open its output in a pager. You can leave the pager by pressing q.

$ git log -p -n 2
commit 5f98f2b0582fab3a7033dfcdffaa13995404be8a
Author: Elena Piscopia <elena@example.net>
Date:   Tue Nov 12 15:05:06 2019 +0100

    add note on datalad save

diff --git a/notes.txt b/notes.txt
index 3a7a1fe..bfa64d7 100644
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,7 @@
 One can create a new dataset with 'datalad create [--description] PATH'.
 The dataset is created empty
 
+The command "datalad save [-m] PATH" saves the file
+(modifications) to history. Note to self:
+Always use informative, concise commit messages.
+

commit b72b237b63a26391d4eeccd077d33100e334a506
Author: Elena Piscopia <elena@example.net>
Date:   Tue Nov 12 15:05:05 2019 +0100

    Add notes on datalad create

diff --git a/notes.txt b/notes.txt
new file mode 100644

We can see that the history can not only show us the commit message attached to a commit, but also the precise change that occurred in the text file in the commit. Additions are marked with a +, and deletions would be shown with a leading -. That’s quite neat, isn’t it?
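The + and - markers belong to the unified diff format, which can also be produced outside of Git. As a small sketch (the file names old.txt and new.txt are hypothetical, and this assumes a diff implementation that supports -u), the following compares two versions of a note in which one line was removed and one was added.

```shell
scratch=$(mktemp -d)

# Two versions of the same note: the second line differs.
printf 'The dataset is created empty\nA line that will go away\n' > "$scratch/old.txt"
printf 'The dataset is created empty\nA line that is new\n' > "$scratch/new.txt"

# diff exits with status 1 when the files differ, so capture the output
# instead of treating that exit status as a failure.
changes=$(diff -u "$scratch/old.txt" "$scratch/new.txt" || true)
printf '%s\n' "$changes"
```

Lines starting with - exist only in old.txt, lines starting with + exist only in new.txt, and lines with a leading space are unchanged context shared by both.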

Find out more: git log has many more useful options

git log, like many other Git commands, has a good number of options, which you can discover by running git log --help. These options can help you find specific changes (e.g., -S finds commits that added or removed a specific word) or change how the git log output looks (e.g., --word-diff highlights individual word changes in the -p output).
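To try these two options without touching DataLad-101, here is a minimal sketch that builds a tiny throwaway Git repository. The repository location, the demo identity, and the file contents are all made up for illustration, and the example assumes a Git version that supports the -C option.

```shell
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
git -C "$repo" init -q
# Identity for the demo commits only (hypothetical values).
git -C "$repo" config user.name "Demo User"
git -C "$repo" config user.email "demo@example.com"

echo "One can create a new dataset." > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "Add notes on datalad create"

echo "The save command records changes." >> "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" commit -q -m "Add note on datalad save"

# -S lists only commits that changed how often the given string occurs.
found=$(git -C "$repo" log -S "records" --format=%s)
printf '%s\n' "$found"

# --word-diff marks changed words inline instead of whole lines.
git -C "$repo" --no-pager log -p -n 1 --word-diff
```

Only the second commit introduces the word "records", so the -S search reports that commit alone.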