[pdftex] Consider removing dependence of PDF ID field on current directory name

Anders Kaseorg andersk at mit.edu
Sat Sep 2 22:45:50 CEST 2017


On Sat, 2 Sep 2017, Karl Berry wrote:
>     With that in mind, could printID be changed to avoid depending on the 
>     current directory name, either by default
> 
> I think we can change it by default. Patches welcome.

Alright.  How does this look?

Anders


diff --git a/source/src/texk/web2c/pdftexdir/ChangeLog b/source/src/texk/web2c/pdftexdir/ChangeLog
index 116541e8..a5ebe6ea 100644
--- a/source/src/texk/web2c/pdftexdir/ChangeLog
+++ b/source/src/texk/web2c/pdftexdir/ChangeLog
@@ -1,3 +1,9 @@
+2017-09-02  Anders Kaseorg  <andersk at mit.edu>
+
+	* utils.c (printID): Do not hash the current directory name into
+	the PDF ID field, since any randomness in it would lead to
+	non-reproducible builds.
+
 2017-03-16  Pali Roh\'ar <pali.rohar at gmail.com>
 
 	Allow .enc files for bitmap fonts, following thread at
diff --git a/source/src/texk/web2c/pdftexdir/utils.c b/source/src/texk/web2c/pdftexdir/utils.c
index 67ff8e9d..fda97666 100644
--- a/source/src/texk/web2c/pdftexdir/utils.c
+++ b/source/src/texk/web2c/pdftexdir/utils.c
@@ -697,9 +697,10 @@ void unescapehex(poolpointer in)
   </blockquote>
   This stipulates only that the two IDs must be identical when the file is
   created and that they should be reasonably unique. Since it's difficult
-  to get the file size at this point in the execution of pdfTeX and
-  scanning the info dict is also difficult, we start with a simpler
-  implementation using just the first two items.
+  to get the file size at this point in the execution of pdfTeX, scanning
+  the info dict is also difficult, and any randomness in the current
+  directory name would lead to non-reproducible builds, we start with a
+  simpler implementation using just the current time and the file name.
  */
 void printID(strnumber filename)
 {
@@ -707,29 +708,13 @@ void printID(strnumber filename)
     md5_byte_t digest[16];
     char id[64];
     char *file_name;
-    char pwd[4096];
     /* start md5 */
     md5_init(&state);
     /* get the time */
     initstarttime();
     md5_append(&state, (const md5_byte_t *) start_time_str, strlen(start_time_str));
     /* get the file name */
-    if (getcwd(pwd, sizeof(pwd)) == NULL)
-        pdftex_fail("getcwd() failed (%s), path too long?", strerror(errno));
-#ifdef WIN32
-    {
-        char *p;
-        for (p = pwd; *p; p++) {
-            if (*p == '\\')
-                *p = '/';
-            else if (IS_KANJI(p))
-                p++;
-        }
-    }
-#endif
     file_name = makecstring(filename);
-    md5_append(&state, (const md5_byte_t *) pwd, strlen(pwd));
-    md5_append(&state, (const md5_byte_t *) "/", 1);
     md5_append(&state, (const md5_byte_t *) file_name, strlen(file_name));
     /* finish md5 */
     md5_finish(&state, digest);
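
For anyone reading the patch without the pdfTeX sources at hand, here is a minimal sketch of the effect of the change. It is not pdfTeX code. After the patch, the only inputs hashed into the ID are the start time string and the file name, so the same two inputs give the same ID no matter which directory the build runs in. The helper names fnv1a_append and sketch_id are hypothetical, the sample strings in the note below are made up, and FNV-1a is swapped in for MD5 purely to keep the example short.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a (64-bit), used here only as a short stand-in for MD5.
   This is an assumption for illustration; printID itself uses md5_append. */
static uint64_t fnv1a_append(uint64_t h, const char *data, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= (unsigned char) data[i];
        h *= 0x100000001b3ULL;          /* FNV 64-bit prime */
    }
    return h;
}

/* Hypothetical helper mirroring the patched printID: hash the start time
   and the file name, and nothing else (no getcwd()).
   Writes 16 hex digits plus a terminating NUL into out. */
static void sketch_id(const char *start_time_str, const char *file_name,
                      char out[17])
{
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV 64-bit offset basis */
    h = fnv1a_append(h, start_time_str, strlen(start_time_str));
    h = fnv1a_append(h, file_name, strlen(file_name));
    snprintf(out, 17, "%016llX", (unsigned long long) h);
}
```

Calling sketch_id("20170902224550", "paper.pdf", buf) from two different working directories fills buf with the same string, while changing either argument changes the result. A build that also pins the start time would therefore get a stable ID.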

