Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add patch to support Unicode filenames in windows #47

Open
edisongustavo opened this issue Oct 10, 2016 · 8 comments
Open

Add patch to support Unicode filenames in windows #47

edisongustavo opened this issue Oct 10, 2016 · 8 comments

Comments

@edisongustavo
Copy link

edisongustavo commented Oct 10, 2016

Hello,

The HDF5 library currently does not support filenames with unicode characters.
This can be seen on these threads: here and here.

This can be easily done through a patch. At the company that I work for, we have our own patched version of HDF5, but we're moving to conda-forge and we want to contribute it back.

Please ignore references to our issue tracker. The patch we use is this:

From d8e1bde709d643cabc2433164ca274fc77602a6c Mon Sep 17 00:00:00 2001
From: Tiago de Holanda Cunha Nobrega <[email protected]>
Date: Fri, 21 Nov 2014 14:46:46 -0200
Subject: [PATCH] Patch code to use _wopen() on windows

Reference:
http://hdf-forum.184993.n3.nabble.com/Problems-when-using-hdf5-on-non-English-windows-td4027373.html#a4027377

EDEN-851

---
 src/H5FDwindows.c    | 15 +++++++++++++++
 src/H5win32defs.h    |  6 +++++-
 windows/copy_hdf.bat |  4 ++--
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git src/H5FDwindows.c src/H5FDwindows.c
index 28434ec..28a3c81 100644
--- src/H5FDwindows.c
+++ src/H5FDwindows.c
@@ -24,8 +24,23 @@
 #include "H5MMprivate.h"    /* Memory management        */
 #include "H5Pprivate.h"     /* Property lists           */

+#include <stdio.h>
+
 #ifdef H5_HAVE_WINDOWS

+int win_open_patched(const char *name, int oflag, int pmode)
+{
+    // Patch to enable the opening of unicode paths
+    // See: EDEN-851
+    int fd = -1;
+    int name_len = strlen(name);
+    wchar_t* wname = (wchar_t*)malloc( sizeof(wchar_t)*(name_len + 1) );
+    MultiByteToWideChar( CP_UTF8, 0, name, -1, wname, name_len + 1 );
+    fd=_wopen(wname, oflag, pmode);
+    free(wname);
+    return fd;
+}
+
 
 /*-------------------------------------------------------------------------
  * Function:    H5Pset_fapl_windows
diff --git src/H5win32defs.h src/H5win32defs.h
index 5f886d1..9ca0e72 100644
--- src/H5win32defs.h
+++ src/H5win32defs.h
@@ -47,7 +47,11 @@ typedef __int64             h5_stat_size_t;
 /* _O_BINARY must be set in Windows to avoid CR-LF <-> LF EOL
  * transformations when performing I/O.
  */
-#define HDopen(S,F,M)       _open(S,F|_O_BINARY,M)
+//#define HDopen(S,F,M)       _open(S,F|_O_BINARY,M)
+// EDEN-851: Patch hdf5 to open unicode file paths.
+H5_DLL int win_open_patched(const char *name, int oflag, int pmode);
+#define HDopen(S,F,M)       win_open_patched(S,F|_O_BINARY,M)
+
 #define HDread(F,M,Z)       _read(F,M,Z)
 #define HDsetvbuf(F,S,M,Z)  setvbuf(F,S,M,(Z>1?Z:2))
 #define HDsleep(S)          Sleep(S*1000)
-- 
2.6.3.windows.1

Would you be willing to add this to the feedstock? I can contribute it with a PR.

@tadeu
Copy link
Member

tadeu commented Oct 24, 2018

@ocefpaf
Copy link
Member

ocefpaf commented Oct 24, 2018

We missed this one completely. I would be OK with that patch. Do you want to send a PR? We would need to test the new package against some key downstream dependencies, like h5py and pytables for example.

@nicoddemus
Copy link
Member

@ocefpaf we have tested this with pytables, but I suspect it should work with h5py without any issues as well.

The only problem is that it is backward-incompatible: it will assume paths are given as UTF-8 strings, which will break on Windows for users which are passing file names with the proper encoding.

@ocefpaf
Copy link
Member

ocefpaf commented Oct 24, 2018

The only problem is that it is backward-incompatible: it will assume paths are given as UTF-8 strings, which will break on Windows for users which are passing file names with the proper encoding.

That may be problem :-/

To be honest I'm not a Windows user to say what is the best course of action here. I'll ask about this in our next meeting and see what others think.

BTW: our meeting are open to the community, let me know if you two want to pitch that idea yourselves 😜

@k-dominik
Copy link

just want to mention that I would be in favor of this patch being part of the recipe

@tadeu
Copy link
Member

tadeu commented Apr 11, 2019

We'll just have to wait for 1.10.6, HDF have already added this feature upstream ;)

https://github.com/live-clones/hdf5/commit/750b5c293076b6a446088fa3020e4e0787d489d7

@k-dominik
Copy link

thanks for this information @tadeu !!

@k-dominik
Copy link

k-dominik commented Oct 7, 2020

should this be closed with hdf5 1.12 being available?

edit: I guess this should be done once conda-forge has migrated to hdf5 1.12...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants