1.3. Modify content
So far, we’ve only added new content to the dataset. And we have not done much to that content up to this point, to be honest. Let’s see what happens if we add content, and then modify it.
For this, in the root of DataLad-101, create a plain text file called notes.txt. It will contain all of the notes that you take throughout the course.
Let’s write a short summary of how to create a DataLad dataset from scratch:
“One can create a new dataset with ‘datalad create [--description] PATH’. The dataset is created empty.”
This is meant to be a note you would take in an educational course. You can take this note and write it to a file with an editor of your choice. The code below, however, contains this note within the start and end part of a here document. You can also copy the full code snippet, starting from cat << EOT > notes.txt and including the EOT in the last line, into your terminal to write this note from the terminal (without any editor) into notes.txt.
How does a here-document work?
The code snippet below writes lines of text into a file called notes.txt, which does not exist yet.
To do this, the content of the “document” is wrapped in between delimiting identifiers. Here, these identifiers are EOT (short for “end of text”), but naming is arbitrary as long as the two identifiers are identical. The first “EOT” identifies the start of the text stream, and the second “EOT” terminates the text stream.
The characters << redirect the text stream into “standard input” (stdin), the standard location that provides the input for a command. Thus, the text stream becomes the input for the cat command, which takes the input and writes it to “standard output” (stdout). Lastly, the > character takes stdout and creates a new file notes.txt with stdout as its contents.
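The mechanics described above can be tried out in isolation. The following is a minimal sketch, assuming a POSIX shell; the scratch directory, the file name demo.txt, and the delimiter NOTE are hypothetical names chosen for illustration and are not part of the course dataset.

```shell
# Work in a throwaway directory (hypothetical; not your DataLad-101 dataset).
scratch=$(mktemp -d)

# The delimiter name is arbitrary: NOTE works just like EOT,
# as long as the opening and closing identifiers are identical.
cat << NOTE > "$scratch/demo.txt"
first line
second line
NOTE

# Without ">", cat writes the here-document to stdout (the terminal)
# and no file is created.
cat << NOTE
printed, not saved
NOTE

# Read the file back: it holds the lines between the two delimiters.
demo_content=$(cat "$scratch/demo.txt")
echo "$demo_content"

# Remove the throwaway directory again.
rm -r "$scratch"
```

The delimiter lines themselves never end up in the file; only the text between them does.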
It might seem like a slightly convoluted way to create a text file with a note in it. But it allows you to write notes from the terminal, enabling this book to create commands you can execute with nothing other than your terminal. You are free to copy-paste the snippets with the here-documents, or find a workflow that suits you better. The only important thing is that you create and modify a .txt file over the course of the Basics part of this handbook.
Running the command below will create notes.txt in the root of your DataLad-101 dataset:
Heredocs don’t work under non-Git-Bash Windows terminals
Heredocs rely on Unix-type redirection and multi-line commands, which are not supported on most native Windows terminals or the Anaconda prompt on Windows. If you are using an Anaconda prompt or a Windows terminal other than Git Bash, do not execute the heredocs; instead, open an editor, paste the text into it, and save it as notes.txt.
The relevant text in the snippet below would be:
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty
If you are using Git Bash, however, here docs will work just fine.
$ cat << EOT > notes.txt
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty
EOT
Run datalad status (manual) to confirm that there is a new, untracked file:
$ datalad status
untracked: notes.txt (file)
Save the current state of this file in your dataset’s history. Because it is the only modification in the dataset, there is no need to specify a path.
$ datalad save -m "Add notes on datalad create"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
add (ok: 1)
save (ok: 1)
But now, let’s see how changing tracked content works.
Modify this file by adding another note. After all, you already know how to use datalad save (manual), so write a short summary on that as well.
Again, the example below uses Unix commands to accomplish this (cat and redirection, this time with >> to append new content to the existing file), but you can use any editor of your choice.
$ cat << EOT >> notes.txt
The command "datalad save [-m] PATH" saves the file (modifications) to
history.
Note to self: Always use informative, concise commit messages.
EOT
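As an aside, the difference between > and >> is easy to check outside of the dataset. This is a minimal sketch, assuming a POSIX shell with default settings; the scratch directory and the file demo.txt are hypothetical and exist only for this illustration.

```shell
# Throwaway location (hypothetical; keeps your dataset untouched).
scratch=$(mktemp -d)
file="$scratch/demo.txt"

echo "first note" > "$file"     # ">" creates the file (or overwrites it)
echo "second note" >> "$file"   # ">>" appends to the existing content
appended=$(cat "$file")

echo "replacement" > "$file"    # ">" again: the previous content is gone
overwritten=$(cat "$file")

echo "$appended"
echo "$overwritten"

# Remove the throwaway directory again.
rm -r "$scratch"
```

This is why the second note is written with >>: using > would replace the first note instead of adding to it.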
Let’s check the dataset’s current state:
$ datalad status
modified: notes.txt (file)
and save the file in DataLad:
$ datalad save -m "add note on datalad save"
add(ok): notes.txt (file)
save(ok): . (dataset)
action summary:
add (ok: 1)
save (ok: 1)
Let’s take another look at our history to see the development of this file. We’re using git log -p -n 2 (manual) to see the last two commits and, within each commit, the difference to the previous state of the file.
$ git log -p -n 2
commit 9783a90c✂SHA1
Author: Elena Piscopia <elena@example.net>
Date: Tue Jun 18 16:13:00 2019 +0000
add note on datalad save
diff --git a/notes.txt b/notes.txt
index 3a7a1fe..0142412 100644
--- a/notes.txt
+++ b/notes.txt
@@ -1,3 +1,7 @@
One can create a new dataset with 'datalad create [--description] PATH'.
The dataset is created empty
+The command "datalad save [-m] PATH" saves the file (modifications) to
+history.
+Note to self: Always use informative, concise commit messages.
+
commit b538dae3✂SHA1
Author: Elena Piscopia <elena@example.net>
Date: Tue Jun 18 16:13:00 2019 +0000
Add notes on datalad create
diff --git a/notes.txt b/notes.txt
new file mode 100644
We can see that the history shows us not only the commit message attached to a commit, but also the precise change that occurred in the text file in that commit. Additions are marked with a +, and deletions would be shown with a leading -.
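The history above contains only additions. To see what a deletion looks like, here is a minimal sketch that uses the standard diff -u utility rather than Git; it produces the same unified format as the patches above. The file names old.txt and new.txt are hypothetical.

```shell
# Two throwaway files that differ in their second line (hypothetical names).
scratch=$(mktemp -d)
printf 'line one\nline two\n' > "$scratch/old.txt"
printf 'line one\nline 2\n' > "$scratch/new.txt"

# diff exits with status 1 when the files differ, hence "|| true".
changes=$(diff -u "$scratch/old.txt" "$scratch/new.txt" || true)
echo "$changes"

# Remove the throwaway directory again.
rm -r "$scratch"
```

A modified line shows up as a pair: the old version with a leading - and the new version with a leading +, while unchanged context lines carry a leading space.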
From the dataset’s history, we can therefore also find out how the text file
evolved over time. That’s quite neat, isn’t it?
git log has many more useful options
git log, like many other Git commands, has a good number of options, which you can discover if you run git log --help. Those options can help you find specific changes (e.g., commits that added or removed a specific word, with -S), or change how the git log output looks (e.g., --word-diff to highlight individual word changes).
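Both options can be tried safely in a scratch repository. The sketch below assumes that Git is installed; the repository location, the file contents, the commit messages, and the helper function commit are hypothetical and only serve the illustration.

```shell
# Scratch repository in a temporary directory (hypothetical; not DataLad-101).
repo=$(mktemp -d)
git init -q "$repo"

# Helper: commit with an identity passed per command,
# so no global Git configuration is needed.
commit() {
    git -C "$repo" -c user.name="Demo" -c user.email="demo@example.com" \
        commit -q -m "$1"
}

echo "The dataset is created empty" > "$repo/notes.txt"
git -C "$repo" add notes.txt
commit "Add first note"

echo "Always use informative commit messages" >> "$repo/notes.txt"
git -C "$repo" add notes.txt
commit "Add second note"

# -S lists only commits that changed how often the given string occurs.
found=$(git -C "$repo" --no-pager log -S"informative" --format=%s)
echo "$found"

# --word-diff marks changed words inline instead of whole lines.
git -C "$repo" --no-pager log -p -n 1 --word-diff
```

Here, -S narrows the history down to the commit that introduced the word "informative", which is useful when a file has many commits and you only care about one specific change.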