1.3. Modify content
So far, we’ve only added new content to the dataset. And we have not done much to that content up to this point, to be honest. Let’s see what happens if we add content, and then modify it.
For this, in the root of
DataLad-101, create a plain text file
notes.txt. It will contain all of the notes that you take
throughout the course.
Let’s write a short summary of how to create a DataLad dataset from scratch:
“One can create a new dataset with ‘datalad create [--description] PATH’. The dataset is created empty”.
This is meant to be a note you would take in an educational course.
You can take this note and write it to a file with an editor of your choice.
The code below, however, contains this note within the start and end part of a
here-document. You can also paste the full code snippet, starting from
cat << EOT > notes.txt and including the
EOT in the last line, into your
terminal to write this note (without any editor) to notes.txt.
How does a here-document work?
The code snippet below writes lines of text into a
file (that so far does not exist) called notes.txt.
To do this, the content of the “document” is wrapped in between delimiting identifiers. Here, these identifiers are EOT (short for “end of text”), but naming is arbitrary as long as the two identifiers are identical. The first “EOT” identifies the start of the text stream, and the second “EOT” terminates the text stream.
The characters << redirect the text stream into
“standard input” (stdin),
the standard location that provides the input for a command.
Thus, the text stream becomes the input for the
cat command, which takes
the input and writes it to
“standard output” (stdout).
Lastly, the > character takes
stdout and creates a new file notes.txt with
stdout as its contents.
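The steps above can be tried in isolation. The following is a minimal sketch that runs in a throwaway directory created with mktemp, so your dataset is not touched. The file name demo.txt and the delimiter END are arbitrary choices for illustration, not names used elsewhere in this book.

```shell
# Work in a scratch directory so nothing in DataLad-101 is touched.
scratch=$(mktemp -d)
cd "$scratch"

# Any identifier works as a delimiter, as long as start and end match.
# cat receives the two lines via stdin and prints them to stdout;
# > sends that stdout into demo.txt, creating the file.
cat << END > demo.txt
first line
second line
END

# The delimiter lines themselves are not part of the file.
cat demo.txt
```

Printing the file afterwards should show only the two text lines, which illustrates that the delimiters mark the boundaries of the text stream without becoming part of it.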
It might seem like a slightly convoluted way to create a text file with
a note in it. But it allows you to write notes from the terminal, enabling
this book to create commands you can execute with nothing other than your terminal.
You are free to copy-paste the snippets with the here-documents,
or find a workflow that suits you better. The only important thing is that
you create and modify a
.txt file over the course of the Basics part of this book.
Running the command below will create
notes.txt in the
root of your dataset:
Heredocs don’t work under non-Git-Bash Windows terminals
Heredocs rely on Unix-type redirection and multi-line commands, which are not supported on most native Windows terminals or the Anaconda prompt on Windows. If you are using an Anaconda prompt or a Windows terminal other than Git Bash, instead of executing heredocs, please open an editor, paste the text into it, and save it.
The relevant text in the snippet below would be:
One can create a new dataset with 'datalad create [--description] PATH'. The dataset is created empty
If you are using Git Bash, however, heredocs will work just fine.
$ cat << EOT > notes.txt
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty

EOT
Run datalad status to confirm that there is a new, untracked file:
$ datalad status
untracked: notes.txt (file)
Save the current state of this file in your dataset’s history. Because it is the only modification in the dataset, there is no need to specify a path.
$ datalad save -m "Add notes on datalad create"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)
But now, let’s see how changing tracked content works. Modify this file by adding another note. After all, you already know how to use datalad save, so write a short summary on that as well.
Again, the example below uses Unix commands (
cat and redirection, this time with
>> to append new content to the existing file)
to accomplish this, but you can use any editor of your choice.
$ cat << EOT >> notes.txt
The command "datalad save [-m] PATH" saves the file (modifications) to
history.
Note to self: Always use informative, concise commit messages.

EOT
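If you would like to convince yourself of the difference between > and >> before touching your notes, the following small sketch may help. It runs in a scratch directory, and the file name demo_notes.txt and its contents are made up for illustration.

```shell
# Work in a scratch directory so notes.txt in DataLad-101 is untouched.
scratch=$(mktemp -d)
cd "$scratch"

# > creates the file (and would overwrite it if it already existed).
cat << EOT > demo_notes.txt
note one
EOT

# >> appends to the existing file and leaves earlier content in place.
cat << EOT >> demo_notes.txt
note two
EOT

# Both notes are now in the file, in the order they were written.
cat demo_notes.txt
```

Had the second command used > instead of >>, the file would contain only the second note, which is why appending is the safer choice for a growing notes file.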
Let’s check the dataset’s current state:
$ datalad status
modified: notes.txt (file)
and save the file in DataLad:
$ datalad save -m "add note on datalad save"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
  add (ok: 1)
  save (ok: 1)
Let’s take another look at our history to see how this file developed. We’re using git log -p -n 2 to see the last two commits and, within each commit, the difference to the previous state of the file.
$ git log -p -n 2
commit f9ac9b9516e5a811a6ba03df3df125d84e00dce8
Author: Elena Piscopia <firstname.lastname@example.org>
Date:   Sun Jul 31 14:27:33 2022 -0700

    add note on datalad save

diff --git a/notes.txt b/notes.txt
index 3a7a1fe..0142412 100644
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,7 @@
 One can create a new dataset with 'datalad create [--description] PATH'.
 The dataset is created empty
 
+The command "datalad save [-m] PATH" saves the file (modifications) to
+history.
+Note to self: Always use informative, concise commit messages.
+

commit 873bc1dbc523a173eb83e9b78ddf55abfd2b20cc
Author: Elena Piscopia <email@example.com>
Date:   Sun Jul 31 14:27:32 2022 -0700

    Add notes on datalad create

diff --git a/notes.txt b/notes.txt
new file mode 100644
We can see that the history can not only show us the commit message attached to
a commit, but also the precise change that occurred in the text file in the commit.
Additions are marked with a
+, and deletions would be shown with a leading -.
From the dataset’s history, we can therefore also find out how the text file
evolved over time. That’s quite neat, isn’t it?
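The history above contains only additions. To see what a deletion looks like, here is a small sketch in a scratch Git repository, assuming git is installed. The file demo.txt, its contents, and the "Demo User" identity are placeholders invented for this illustration.

```shell
# Everything happens in a scratch repository, not in DataLad-101.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
# Throwaway identity for this repository only (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'keep this line\nold wording\n' > demo.txt
git add demo.txt
git commit -q -m "Add demo file"

# Replace the second line, then commit the change.
printf 'keep this line\nnew wording\n' > demo.txt
git commit -q -a -m "Reword second line"

# Show the patch of the most recent commit only.
git --no-pager log -p -n 1
```

In the resulting patch, the replaced line should appear once with a leading - (the old wording) and once with a leading + (the new wording), while the unchanged line is shown as context with a leading space.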
git log has many more useful options
git log, like many other
Git commands, has a good number of options
which you can discover if you run
git log --help. Those options could
help to find specific changes (e.g., commits which added or removed a specific word,
with -S), or change how the
git log output will look (e.g.,
--word-diff to highlight individual word changes in the diff output).
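As a rough illustration of these two options, the sketch below builds a scratch Git repository with three small commits and then queries it. It assumes git is installed; the file demo.txt, the commit messages, and the "Demo User" identity are invented for this example.

```shell
# Everything happens in a scratch repository, not in DataLad-101.
scratch=$(mktemp -d)
cd "$scratch"
git init -q .
# Throwaway identity for this repository only (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'datasets are created empty\n' > demo.txt
git add demo.txt
git commit -q -m "Add first note"

printf 'datasets are created empty\nsave records modifications\n' > demo.txt
git commit -q -a -m "Add note on save"

printf 'datasets are created empty\nsave records changes\n' > demo.txt
git commit -q -a -m "Reword note on save"

# -S lists only commits that changed how often the given string occurs:
# here, the commit that introduced "modifications" and the one that removed it.
git --no-pager log --oneline -S modifications

# --word-diff marks changed words inline instead of showing whole lines.
git --no-pager log -p -n 1 --word-diff
```

The first query should list the two commits touching the word "modifications" but not the initial commit. The second should show the replaced word in brackets and the new word in braces on a single line, rather than a removed and an added line.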