Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Invalid argument" error when "creating metadata EA" (OpenWRT, netatalk 3.1.18) #974

Closed
APCCV opened this issue May 12, 2024 · 23 comments
Closed

Comments

@APCCV
Copy link

APCCV commented May 12, 2024

Describe the bug
Attempting to revive the netatalk package for OpenWrt. Using OpenWrt v23.05.3 on ipq806x (arm_cortex-a15_neon-vfpv4) hardware.
Tried to use netatalk 3.1.11 existing packages for older OpenWrt versions and also built new with netatalk 3.1.13 - 3.1.18. Behaviour is identical for all netatalk versions tried out. Using a ext4 fs.
Client is MacOS 10.13.16. Goal is to use Time Machine but issue described below occurs even with non-TM write (copy file through Finder).
This could well be an issue outside netatalk, but here goes.

Client able to connect. Client able to list files on share. However, on attempt to write a file to the share, file is created, zero length. Client reports "error -50" and operation fails.

Log:

May 11 16:57:02.286126 afpd[10559] {ad_open.c:1233} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg", DF|HF|NOHF|O_RDWR|O_CREAT|O_EXCL): BEGIN [dfd: 7 (ref: 1), mfd: 7 (ref: 1), rfd: -1 (ref: 0)]
May 11 16:57:02.286222 afpd[10559] {ad_open.c:768} (debug:ad): ad_header_read_ea: No data available
May 11 16:57:02.286306 afpd[10559] {ad_open.c:1282} (debug:ad): ad_open_hf_ea("IMG_4146 copy 2.jpg"): creating metadata EA
May 11 16:57:02.286522 afpd[10559] {ad_open.c:1286} (error:ad): ad_open_hf_ea: unexpected: Invalid argument

Looking at code, I could not locate where EINVAL error is generated for this case. Log show where error is being thrown, so patched temporarily to ignore the error.

--- a/libatalk/adouble/ad_open.c
+++ b/libatalk/adouble/ad_open.c
@@ -1283,8 +1283,8 @@

         /* It doesnt exist, EPERM or another error */
         if (!(errno == ENOATTR || errno == ENOENT)) {
             LOG(log_error, logtype_ad, "ad_open_hf_ea: unexpected: %s", strerror(errno));
-             EC_FAIL;
+            /* EC_FAIL; */
         }

         /* Create one */

Everything seems to be working fine with this, but clearly this is not a solution. Where is the EINVAL coming from?

To Reproduce
Install netatalk packages on OpenWrt (requires some creativity as the package is no longer available for latest OpenWrt versions - trying to fix that).
Finder, Connect to Server, afp:// target, connect, Copy file from HD, paste to share.

Expected behavior
File gets written with content. Refer to patch above to bypass error.
File EA should be written too... I am not sure that is happening if error is ignored but have no way to verify.
What is generating the EINVAL error?

Environment

  • Server OS: OpenWrt 23.05.3
  • Client OS: MacOs 10.13.16
  • Netatalk Version: 3.1.18 (although tried with several older versions and behaviour was unchanged)

Config:

;
; Netatalk 3.x configuration file
;

[Global]
; Global server settings
log file = /var/log/afpd.log
#log level = afpdaemon:debug9,uamsdaemon:debug9,default:debug9
afp interfaces = br-lan
afp listen = 192.168.128.80:584

[TimeMachine_IM]
path = /mnt/sda1/tm_im
time machine = yes
valid users = @users

Logs

May 11 16:57:19.375363 afpd[10559] {afp_dsi.c:624} (debug:AFPDaemon): <== Start AFP command: AFP_GETFLDRPARAM                                                                                                                                   
May 11 16:57:19.375444 afpd[10559] {directory.c:566} (debug:AFPDaemon): dirlookup(did: 2): START                                                                                                                                                
May 11 16:57:19.375530 afpd[10559] {directory.c:715} (debug:AFPDaemon): dirlookup(did: 2): RESULT: pdid: 1, path: "/mnt/sdb1/tm_im"                                                                                                             
May 11 16:57:19.375614 afpd[10559] {directory.c:1319} (debug:AFPDaemon): movecwd(to: did: 2, "/mnt/sdb1/tm_im")                                                                                                                                 
May 11 16:57:19.375728 afpd[10559] {ofork.c:241} (debug:AFPDaemon): of_stat('/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg': No such file or directory)                                                                                                   
May 11 16:57:19.375815 afpd[10559] {directory.c:1288} (debug:AFPDaemon): cname('/mnt/sdb1/tm_im') {end: curdir:'/mnt/sdb1/tm_im', path:'IMG_4146 copy 2.jpg'}                                                                                   
May 11 16:57:19.375909 afpd[10559] {filedir.c:81} (debug:AFPDaemon): getfildirparams(vid:1, did:2, f/d:e93f/a33f) {cwdid:2, cwd: /mnt/sdb1/tm_im, name:'IMG_4146 copy 2.jpg'}                                                                   
May 11 16:57:19.375993 afpd[10559] {afp_dsi.c:632} (debug:AFPDaemon): ==> Finished AFP command: AFP_GETFLDRPARAM -> AFPERR_NOOBJ                                                                                                                
May 11 16:57:19.376077 afpd[10559] {dsi_cmdreply.c:23} (debug:DSI): dsi_cmdreply(DSI ID: 823, len: 0): START                                                                                                                                    
May 11 16:57:19.376168 afpd[10559] {dsi_cmdreply.c:32} (debug:DSI): dsi_cmdreply(DSI ID: 823, len: 0): END                                                                                                                                      
May 11 16:57:19.376262 afpd[10559] {dsi_stream.c:161} (debug:DSI): from_buf(read: 0, unread:0 , space left: 12582912): returning 0                                                                                                              
May 11 16:57:19.376357 afpd[10559] {dsi_stream.c:161} (debug:DSI): from_buf(read: 0, unread:0 , space left: 12582912): returning 0                                                                                                              
May 11 16:57:19.384042 afpd[10559] {dsi_stream.c:161} (debug:DSI): from_buf(read: 0, unread:0 , space left: 12582912): returning 0                                                                                                              
May 11 16:57:19.384156 afpd[10559] {dsi_stream.c:638} (debug:DSI): dsi_stream_receive: DSI cmdlen: 71                                                                                                                                           
May 11 16:57:19.384243 afpd[10559] {afp_dsi.c:609} (debug:DSI): DSI request ID: 824                                                                                                                                                             
May 11 16:57:19.384323 afpd[10559] {afp_dsi.c:624} (debug:AFPDaemon): <== Start AFP command: AFP_COPYFILE                                                                                                                                       
May 11 16:57:19.384406 afpd[10559] {directory.c:566} (debug:AFPDaemon): dirlookup(did: 2): START                                                                                                                                                
May 11 16:57:19.384492 afpd[10559] {directory.c:715} (debug:AFPDaemon): dirlookup(did: 2): RESULT: pdid: 1, path: "/mnt/sdb1/tm_im"                                                                                                             
May 11 16:57:19.384579 afpd[10559] {directory.c:1319} (debug:AFPDaemon): movecwd(to: did: 2, "/mnt/sdb1/tm_im")                                                                                                                                 
May 11 16:57:19.384694 afpd[10559] {directory.c:1225} (debug:AFPDaemon): cname('/mnt/sdb1/tm_im'): {file: 'IMG_4146 copy.jpg'}                                                                                                                  
May 11 16:57:19.384783 afpd[10559] {directory.c:1288} (debug:AFPDaemon): cname('/mnt/sdb1/tm_im') {end: curdir:'/mnt/sdb1/tm_im', path:'IMG_4146 copy.jpg'}                                                                                     
May 11 16:57:19.384887 afpd[10559] {ad_open.c:1995} (debug:ad): ad_open("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|HF|NOHF|SHRMD|O_RDONLY): BEGIN {d: 1, m: 1, r: 0}[dfd: 6 (ref: 2), mfd: 6 (ref: 2), rfd: -1 (ref: 0)]                           
May 11 16:57:19.384985 afpd[10559] {ad_open.c:1020} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|HF|NOHF|SHRMD|O_RDONLY): BEGIN [dfd: 6 (ref: 2), mfd: 6 (ref: 2), rfd: -1 (ref: 0)]                                          
May 11 16:57:19.385083 afpd[10559] {ad_open.c:1086} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|HF|NOHF|SHRMD|O_RDONLY): END: 0 [dfd: 6 (ref: 3), mfd: 6 (ref: 3), rfd: -1 (ref: 0)]                                         
May 11 16:57:19.385177 afpd[10559] {ad_open.c:1233} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|HF|NOHF|SHRMD|O_RDONLY): BEGIN [dfd: 6 (ref: 3), mfd: 6 (ref: 3), rfd: -1 (ref: 0)]                                       
May 11 16:57:19.385281 afpd[10559] {ad_open.c:1355} (debug:ad): ad_reso_size("IMG_4146 copy.jpg"): BEGIN                                                                                                                                        
May 11 16:57:19.385400 afpd[10559] {ad_open.c:1307} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|HF|NOHF|SHRMD|O_RDONLY): END: 0 [dfd: 6 (ref: 4), mfd: 6 (ref: 4), rfd: -1 (ref: 0)]                                      
May 11 16:57:19.385497 afpd[10559] {ad_open.c:2058} (debug:ad): ad_open("/mnt/sdb1/tm_im/IMG_4146 copy.jpg"): END: 0 {d: 2, m: 2, r: 0}[dfd: 6 (ref: 4), mfd: 6 (ref: 4), rfd: -1 (ref: 0)]                                                     
May 11 16:57:19.385595 afpd[10559] {ad_lock.c:600} (debug:ad): ad_testlock(data, off: 9223372036854775803 (DENY_RD_DATA): BEGIN                                                                                                                 
May 11 16:57:19.385697 afpd[10559] {ad_lock.c:66} (debug:ad): set_lock(fd: 6, F_GETLK, F_WRLCK, off: 9223372036854775803 (DENY_RD_DATA), len: 1): BEGIN                                                                                         
May 11 16:57:19.385785 afpd[10559] {ad_lock.c:613} (debug:ad): ad_testlock: END: 0                                                                                                                                                              
May 11 16:57:19.385871 afpd[10559] {ad_lock.c:600} (debug:ad): ad_testlock(reso, off: 9223372036854775803 (DENY_RD_DATA): BEGIN                                                                                                                 
May 11 16:57:19.385962 afpd[10559] {ad_lock.c:66} (debug:ad): set_lock(fd: 6, F_GETLK, F_WRLCK, off: 9223372036854775805 (DENY_RD_RSRC), len: 1): BEGIN                                                                                         
May 11 16:57:19.386046 afpd[10559] {ad_lock.c:613} (debug:ad): ad_testlock: END: 0                                                                                                                                                              
May 11 16:57:19.386134 afpd[10559] {filedir.c:695} (debug:AFPDaemon): absupath: /mnt/sdb1/tm_im/IMG_4146 copy.jpg                                                                                                                               
May 11 16:57:19.386225 afpd[10559] {directory.c:566} (debug:AFPDaemon): dirlookup(did: 2): START                                                                                                                                                
May 11 16:57:19.386311 afpd[10559] {directory.c:715} (debug:AFPDaemon): dirlookup(did: 2): RESULT: pdid: 1, path: "/mnt/sdb1/tm_im"                                                                                                             
May 11 16:57:19.386512 afpd[10559] {directory.c:1319} (debug:AFPDaemon): movecwd(to: did: 2, "/mnt/sdb1/tm_im")                                                                                                                                 
May 11 16:57:19.386622 afpd[10559] {directory.c:1288} (debug:AFPDaemon): cname('/mnt/sdb1/tm_im') {end: curdir:'/mnt/sdb1/tm_im', path:'.'}                                                                                                     
May 11 16:57:19.386717 afpd[10559] {file.c:1443} (debug:AFPDaemon): copyfile(sfd:-1,s:'/mnt/sdb1/tm_im/IMG_4146 copy.jpg',d:'IMG_4146 copy 2.jpg',n:'IMG_4146 copy 2.jpg')                                                                      
May 11 16:57:19.386824 afpd[10559] {ad_open.c:1995} (debug:ad): ad_open("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|RF|NORF|HF|NOHF|O_RDONLY): BEGIN {d: 2, m: 2, r: 0}[dfd: 6 (ref: 4), mfd: 6 (ref: 4), rfd: -1 (ref: 0)]                         
May 11 16:57:19.386920 afpd[10559] {ad_open.c:1020} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|RF|NORF|HF|NOHF|O_RDONLY): BEGIN [dfd: 6 (ref: 4), mfd: 6 (ref: 4), rfd: -1 (ref: 0)]                                        
May 11 16:57:19.387011 afpd[10559] {ad_open.c:1086} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|RF|NORF|HF|NOHF|O_RDONLY): END: 0 [dfd: 6 (ref: 5), mfd: 6 (ref: 5), rfd: -1 (ref: 0)]                                       
May 11 16:57:19.387102 afpd[10559] {ad_open.c:1233} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|RF|NORF|HF|NOHF|O_RDONLY): BEGIN [dfd: 6 (ref: 5), mfd: 6 (ref: 5), rfd: -1 (ref: 0)]                                     
May 11 16:57:19.387199 afpd[10559] {ad_open.c:1355} (debug:ad): ad_reso_size("/mnt/sdb1/tm_im/IMG_4146 copy.jpg"): BEGIN                                                                                                                        
May 11 16:57:19.387305 afpd[10559] {ad_open.c:1307} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy.jpg", DF|RF|NORF|HF|NOHF|O_RDONLY): END: 0 [dfd: 6 (ref: 6), mfd: 6 (ref: 6), rfd: -1 (ref: 0)]                                    
May 11 16:57:19.387389 afpd[10559] {ad_open.c:1419} (debug:ad): ad_open_rf("/mnt/sdb1/tm_im/IMG_4146 copy.jpg"): BEGIN                                                                                                                          
May 11 16:57:19.387490 afpd[10559] {ad_open.c:1545} (debug:ad): ad_open_rf("/mnt/sdb1/tm_im/IMG_4146 copy.jpg"): END: 0                                                                                                                         
May 11 16:57:19.387581 afpd[10559] {ad_open.c:2058} (debug:ad): ad_open("/mnt/sdb1/tm_im/IMG_4146 copy.jpg"): END: 0 {d: 3, m: 3, r: 0}[dfd: 6 (ref: 6), mfd: 6 (ref: 6), rfd: -1 (ref: 0)]                                                     
May 11 16:57:19.387687 afpd[10559] {ad_open.c:1995} (debug:ad): ad_open("/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg", DF|NORF|HF|NOHF|O_RDWR|O_CREAT|O_EXCL): BEGIN {d: 0, m: 0, r: 0}[dfd: -1 (ref: 0), mfd: -1 (ref: 0), rfd: -1 (ref: 0)]           
May 11 16:57:19.387784 afpd[10559] {ad_open.c:1020} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg", DF|NORF|HF|NOHF|O_RDWR|O_CREAT|O_EXCL): BEGIN [dfd: -1 (ref: 0), mfd: -1 (ref: 0), rfd: -1 (ref: 0)]                          
May 11 16:57:19.387990 afpd[10559] {ad_open.c:1086} (debug:ad): ad_open_df("/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg", DF|NORF|HF|NOHF|O_RDWR|O_CREAT|O_EXCL): END: 0 [dfd: 7 (ref: 1), mfd: 7 (ref: 1), rfd: -1 (ref: 0)]                           
May 11 16:57:19.388113 afpd[10559] {ad_open.c:1233} (debug:ad): ad_open_hf_ea("/mnt/sdb1/tm_im/IMG_4146 copy 2.jpg", DF|NORF|HF|NOHF|O_RDWR|O_CREAT|O_EXCL): BEGIN [dfd: 7 (ref: 1), mfd: 7 (ref: 1), rfd: -1 (ref: 0)]                         
May 11 16:57:19.388210 afpd[10559] {ad_open.c:768} (debug:ad): ad_header_read_ea: No data available                                                                                                                                             
May 11 16:57:19.388302 afpd[10559] {ad_open.c:1282} (debug:ad): ad_open_hf_ea("IMG_4146 copy 2.jpg"): creating metadata EA                                                                                                                      
May 11 16:57:19.388386 afpd[10559] {ad_open.c:1286} (error:ad): ad_open_hf_ea: unexpected: Invalid argument                                                                                                                                                   

No longer have rest of log but at that point damage is done. With temporary hack above things seem normal afterwards.

Additional context
If it is a crash, please attach a stacktrace.

@APCCV
Copy link
Author

APCCV commented May 12, 2024

Related to #368 ?

@rdmark
Copy link
Member

rdmark commented May 12, 2024

Can you please share the contents of your afp.conf file?

Also, if you have the opportunity, try the bleeding edge main branch code and see if the behavior is different there (I doubt it will be, but just in case.)

@APCCV
Copy link
Author

APCCV commented May 12, 2024

Sure. Here goes. Not much there. Thank you very much for taking a look.

;
; Netatalk 3.x configuration file
; 

[Global]
; Global server settings
log file = /var/log/afpd.log
#log level = afpdaemon:debug9,uamsdaemon:debug9,default:debug9
afp interfaces = br-lan
afp listen = 192.168.128.80:584

[TimeMachine_IM]
path = /mnt/sda1/tm_im
time machine = yes
valid users = @users

I have afp listen set as older versions (3.1.12?) were getting messed up with IPv6 addresses. I'll take that out when I get a chance. Should have no impact on this issue. Besides, configured the bridge to be IPv4 only for the time being so the only IP available is a v4 anyways.

I'll look into using the repo's head for a build... might take a while for me to learn how to tweak OpenWrt's package Makefile to pull that version of the code and not use a tagged commit.

@APCCV
Copy link
Author

APCCV commented May 12, 2024

In case it helps... this is how it the build is configured (I think I enable(d)-debugging on last build hoping it would give me something else to look at but I don't know what to look for).

CONFIGURE_ARGS += \
        --disable-afs \
        --disable-hfs \
        --enable-debugging \
        --disable-shell-check \
        --disable-timelord \
        --disable-a2boot \
        --disable-cups \
        --disable-tcp-wrappers \
        --with-cnid-default-backend=dbd \
        --with-bdb="$(STAGING_DIR)/usr/" \
        --with-libevent=no \
        --with-libgcrypt-dir="$(STAGING_DIR)/usr" \
        --with-ssl-dir="$(STAGING_DIR)/usr" \
        --with-uams-path="/usr/lib/uams" \
        --without-acls \
        --without-kerberos \
        --without-mysql \
        --with-mysql-config=false \
        --without-pam \
        --disable-admin-group \
        --disable-srvloc \
        --disable-zeroconf \
        $(if $(CONFIG_SHADOW_PASSWORDS),--with-shadow,--without-shadow) \
        --without-dtrace \
        --without-ldap

So:

root@OpenWrt:/tmp/netatalk# afpd -V
afpd 3.1.18 - Apple Filing Protocol (AFP) daemon of Netatalk

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version. Please see the file COPYING for further information and details.

afpd has been compiled with support for these features:

          AFP versions:	2.2 3.0 3.1 3.2 3.3 3.4 
         CNID backends:	dbd last tdb 
      Zeroconf support:	No
  TCP wrappers support:	No
         Quota support:	No
   Admin group support:	Yes
    Valid shell checks:	No
      cracklib support:	No
            EA support:	ad | sys
           ACL support:	No
          LDAP support:	No
         D-Bus support:	No
     Spotlight support:	No
         DTrace probes:	No

@rdmark
Copy link
Member

rdmark commented May 13, 2024

I agree that there's nothing in that afp.conf to explain the issue.

What I can tell so far is that something bugs out when filesystem extended attributes are being set. Are you able to read and write EAs manually to an arbitrary file? e.g. by using setfattr and getfattr. This test would tell us that your file system has this capability.

@APCCV
Copy link
Author

APCCV commented May 13, 2024

Some EA stuff is getting written.

root@OpenWrt:/mnt/sda1/tm_im# getfattr -d -m - .
# file: .
user.org.netatalk.Metadata=0sAAUWB<...>AAAAAAA
user.org.netatalk.has-Extended-Attributes="yes"

root@OpenWrt:/mnt/sda1/tm_im/iMac.sparsebundle# getfattr -d -m - *
# file: Info.bckup
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

# file: Info.plist
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

# file: bands
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

# file: com.apple.TimeMachine.MachineID.bckup
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

# file: com.apple.TimeMachine.MachineID.plist
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

# file: com.apple.TimeMachine.Results.plist
user.org.netatalk.Metadata=0sAAUWB<..>DwIA

# file: com.apple.TimeMachine.SnapshotHistory.plist
user.org.netatalk.Metadata=0sAAUWB<..>DwIA

# file: token
user.org.netatalk.Metadata=0sAAUWB<..>AAAA

(netatalk.Metadata values are all different - truncated them for readibility).

Attributes seemingly can be set on a new file:

root@OpenWrt:/mnt/sda1/tm_im# touch testfile
root@OpenWrt:/mnt/sda1/tm_im# attr -s this.is.a testfile
test<CTRL-D>
root@OpenWrt:/mnt/sda1/tm_im# getfattr -d -m - testfile 
# file: testfile
user.this.is.a="test"

root@OpenWrt:/mnt/sda1/tm_im# touch testtoo
root@OpenWrt:/mnt/sda1/tm_im# attr -s this.is.a testtoo
second test<CTRL-D>
root@OpenWrt:/mnt/sda1/tm_im# attr -s this.is.another testtoo 
testAttribute "this.is.another" set to a 4 byte value for testtoo:
test<CTRL-D>
root@OpenWrt:/mnt/sda1/tm_im# getfattr -d -m - testtoo
# file: testtoo
user.this.is.a="second test"
user.this.is.another="test"

And this didn't work, either because I don't know how to use this (probable) or something is not supported:

root@OpenWrt:/mnt/sda1/tm_im# touch doesntwork
root@OpenWrt:/mnt/sda1/tm_im# setfattr -n "this.is.a" -v "test" doesntwork
setfattr: doesntwork: Not supported

I get the same behaviour in Ubuntu so it doesn't seem to be a quirk with OpenWrt attr utilities.


Edit: regarding setfattr, problem is me. Needs the "user" moniker both in Ubuntu and OpenWrt. This works:

root@OpenWrt:/mnt/sda1/tm_im# setfattr -n user.this.is.a -v test.again.and.again testfile 
root@OpenWrt:/mnt/sda1/tm_im# getfattr -m - -d testfile 
# file: testfile
user.this.is.a="test.again.and.again"

@APCCV
Copy link
Author

APCCV commented May 13, 2024

Does it have any relationship with this previous log entry?
May 11 16:57:19.388210 afpd[10559] {ad_open.c:768} (debug:ad): ad_header_read_ea: No data available

Maybe something gets messed up if there is no data to begin with. I don't quite understand what the code is doing but will take another look.

@APCCV
Copy link
Author

APCCV commented May 14, 2024

Adding more log statements to code to try figure out where EINVAL is first assigned. Don't know yet (if ever), but...

  • errno is a global symbol, right? It's not declared in ad_open.c AFAIK
  • errno seems to be EINVAL from the start of ad_open.c. I added a log statement to print errno first thing and it is always 22. From afpd start up, the first time ever this code is called, errno is already EINVAL (second to last log entry in log snippet below, line 2018) and that is pushed around until it bombs when attempting to write EA.
May 14 13:26:01.349669 afpd[7324] {netatalk_conf.c:1578} (debug:AFPDaemon): load_volumes: BEGIN                                                      
May 14 13:26:01.349853 afpd[7324] {netatalk_conf.c:1693} (debug:AFPDaemon): load_volumes: END
May 14 13:26:01.350058 afpd[7324] {cnid_dbd.c:473} (debug:CNID): Finished initializing CNID dbd module for volume 'TimeMachine_IM'
May 14 13:26:01.350164 afpd[7324] {volume.c:313} (debug:AFPDaemon): getvolparams: Volume 'TimeMachine_IM'                                                                                                                  
May 14 13:26:01.350253 afpd[7324] {ad_conv.c:278} (debug:ad): ad_convert("/mnt/sda1/tm_im"): BEGIN                                                    
May 14 13:26:01.350354 afpd[7324] {ad_conv.c:65} (debug:ad): ad_conv_v22ea_hf("/mnt/sda1/tm_im"): BEGIN                                         
May 14 13:26:01.350438 afpd[7324] {ad_open.c:2018} (error:ad): ad_open_hf_ea: WHAT IS ERRNO HERE: 22                                                                                                                    
May 14 13:26:01.350545 afpd[7324] {ad_open.c:2019} (debug:ad): ad_open("/mnt/sda1/tm_im", HF|DIR|O_RDWR): BEGIN {d: 0, m: 0, r: 0}[dfd: -1 (ref: 0), mfd: -1 (ref: 0), rfd: -1 (ref: 0)]

Not sure I'll find what is setting errno to EINVAL (22) - maybe it's on cnid_dbd.c while trying to configure ports or something given the weirdness of OpenWrt compared to normal servers... but... should it be cleared/initialized somewhere in the early start of ad_open.c?

@APCCV
Copy link
Author

APCCV commented May 14, 2024

Adding an errno = 0 reset to ad_init() in /libatalk/adouble/ad_open.c seems cleaner and behaves equally as well.

void ad_init(struct adouble *ad, const struct vol * restrict vol)
{
    memset(ad, 0, sizeof(struct adouble));
    ad->ad_vers = vol->v_adouble;
    ad->ad_options = vol->v_ad_options;
    ad_init_func(ad);
    errno = 0;
}

That way if something throws an error to errno between init and end of operation it would be handled as code does today.

I appreciate that OpenWrt is a bit of a niche, and that I'm probably one of the last three people ever that might want to use netatalk to support Time Machines on it for pre-SMB, obsolete MacOS on Macs that can't be upgraded. Not asking for a fix unless this is something you think would benefit everyone, but just your opinion on how much of a terrible idea this is. I can add it as a patch to the OpenWrt Makefile and I'm good to go if you think this holds water.

@rdmark
Copy link
Member

rdmark commented May 15, 2024

yes, errno is indeed a global variable. Maybe not the best design decision, but it’s what we have right now. :)

let me take a look closer at your patch tonight and make up my own mind if it’s a good idea or not. To facilitate the code review and triaging, would you mind filing a pull request with the code change?

FWIW you’re not the first person who wants to use netatalk on openwrt… see:

#562

and, at this particular point in history, netatalk is arguably the superior FLOSS choice over Samba for macOS file sharing, if you value transfer speeds. We have a core user base that stick with netatalk for this reason.

@rdmark
Copy link
Member

rdmark commented May 15, 2024

As a side note, there is another workaround you could potentially try: configure netatalk to store meta data in AppleDouble dot directories rather than filesystem EA.

The afp.conf option is ea = ad

see also the manual: https://netatalk.io/stable/htmldocs/afp.conf.5

in fact, I would be curious to know if this bypasses the error or not. Would give us another indicator where the actual issue lies.

@APCCV
Copy link
Author

APCCV commented May 15, 2024

Thanks!
Bunch of things to try so it will take a while.

  • Rebuild main directory and CNID as per Invalid metadata EA errors #236 (comment) and try a vanilla build (halfway there)
  • Try vanilla build with ea = ad
  • Make a PR for the hack above
    I'll get to it eventually. Thanks again!

@rdmark
Copy link
Member

rdmark commented May 16, 2024

Note to self: Get openwrt up and running in a VM for testing. https://openwrt.org/docs/guide-user/virtualization/virtualbox-vm

@APCCV
Copy link
Author

APCCV commented May 16, 2024

Compiling for OpenWrt can be a bit of an art. I suggest you download the x86 SDK.

netatalk is no longer available in the packages directory. I pulled the package definition from version 21.x. I can post the tar file somewhere if you want it (make file, config file, and patches). I've asked to be the new maintainer so it can be put back on.

I'm cross-compiling for an ipq806x, so that is the SDK I am using instead of x86. Should make no difference. SDK works well and allows to build the netatalk package. The package definition and an updated Makefile for the package with the 3.1.18 reference is needed.

Once the package is compiled, scp-ing it to the running vm and running opkg install <filename> should work and should bring in all dependencies like xml2.

There is no UI module for netatalk so all configuration needs to be done via ssh.

Makefile change in package in case you pull the package from 21.x feeds:

#
# Copyright (C) 2009-2013 OpenWrt.org
#
# This is free software, licensed under the GNU General Public License v2.
# See /LICENSE for more information.
#

include $(TOPDIR)/rules.mk

PKG_NAME:=netatalk
PKG_VERSION:=3.1.18
PKG_RELEASE:=3

PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.bz2
PKG_SOURCE_URL:=@SF/netatalk
PKG_HASH:=86d209dc7776a4ba17845164d2e376a469cee376a27b18ae31898e3f4ba41655
   ...

(That's an update to the PKG_HASH, VERSION and RELEASE

@rdmark
Copy link
Member

rdmark commented May 17, 2024

Thanks for the friendly guide. I'll play around with it this coming weekend.

As a side note, since you're aiming to potentially package netatalk for openwrt, we are in the process of transitioning to Meson as the primary build system, rather than traditional autotools and makefiles. A quick googling suggests that openwrt has support for the Meson build system. In short, Meson is a lot faster and simpler to work with. Most configure options are named the same but with a -D prefix rather than --. I'd be happy to give you some hands on support if you want to try it out.!

@APCCV
Copy link
Author

APCCV commented May 17, 2024

Once there is a tarball available that relies on Meson we can throw it at the build system to see what gives. Might need to reformat or rewrite the package's Makefile for it to Meson the netatalk build. Happy to try that out.
I may be able to try the different tests above this weekend. Will post results when I get them. Will get the PR going after I test the first one just to discount this wasn't an environmental thing (which I doubt).
Cheers!

@APCCV
Copy link
Author

APCCV commented May 22, 2024

No difference with ea = ad. Still fails with "Invalid argument" either way and leaves zero-length files with nothing else.

With a patched version ea = ad creates metadata files as expected:

root@OpenWrt:~# ls -la /mnt/sda1/test
drwxrwxrwx    2 root     root          4096 May 22 11:52 .
drwxr-xr-x    7 1000     1000          4096 May 22 11:42 ..
-rw-r--r--    1 timeim   timeim        6148 May 22 11:48 .DS_Store
-rw-rw-rw-    1 timeim   timeim          63 May 22 11:52 ._IMG_4078.jpg::EA
-rw-rw-rw-    1 timeim   timeim          16 May 22 11:52 ._IMG_4078.jpg::EA::com.apple.lastuseddate#PS
-rw-rw-rw-    1 timeim   timeim          21 May 22 11:52 ._IMG_4078.jpg::EA::com.apple.quarantine
-rw-r--r--    1 timeim   timeim     1439398 Nov 19  2023 IMG_4078.jpg

Patching ad_init() to set errno = 0 stopped working. I am unable to make it work again with that change. This makes me think the EINVAL value is being set somewhere else, with probably a race condition involved.

What I ended up doing is patching ad_open_hf_ea() to ignore the error:

@@ -1284,7 +1284,10 @@
         /* It doesnt exist, EPERM or another error */
         if (!(errno == ENOATTR || errno == ENOENT)) {
             LOG(log_error, logtype_ad, "ad_open_hf_ea: unexpected: %s", strerror(errno));
+            /* OpenWrt ends up with EINVAL at this point, no clue why. Log it but keep going. */
+            if (errno != EINVAL) {
+                EC_FAIL;
+            }
-            EC_FAIL;
         }

         /* Create one */

I'll try to track down where that EINVAL is coming from but it will require tons of changes for extra logging so may take a while.

I lost the package definition files required for building in OpenWrt SDK. Don't know what the difference may be but now the build also depends on libevent. That may be part of the reason why the previous patch stopped working. Which also makes me think... maybe OpenWrt is doing something unexpected during configure? I can get a log of what it did and post here if you think it's useful.

@rdmark
Copy link
Member

rdmark commented Jun 2, 2024

@APCCV A stable 3.2.0 release is now available. Do you think it would be possible to upgrade your setup to this version and evaluate how it performs on your system?

This release contains two specific patches that address EA metadata handling, namely: #513 and #575

I would be very curious to learn if you're seeing different outcomes with this version!

@APCCV
Copy link
Author

APCCV commented Jun 11, 2024

@rdmark It works!
No more patch to ignore EINVAL.

Just adjusted the OpenWrt package Makefile to pick up 3.2.0 and it works. That means the build still uses automake and not meson. Can look at that later.

So far I have not been successful to get the package accepted back on OpenWrt. I'll update the PR with this version and try again.

@rdmark
Copy link
Member

rdmark commented Jun 11, 2024

@APCCV Yippie! Thanks for testing, and for working through the issue with us.

As a technical note, Netatalk 3.2.0 is now cleaning up "invalid" EA metadata when it encounters it, so you may see similar messages in the future in rare cases when the metadata is actually somehow bad (from Netatalk's perspective). However, it should only appear once for a file or dir. If you see repeated messages for the same file or dir, it means that something is creating EA metadata that isn't "Netatalk compatible". If this starts occurring for you, please file a new bug here and we'll look into it!

Using the Autotools build system is totally fine, for the lifecycle of the 3.2.x release series. The only thing you miss out on is the bundled SSL provider, which is only relevant if you lack a compatible SSL provider in OpenWRT (namely OpenSSL 1.1 or LibreSSL). But we of course would love you to try out our fancy new build system, because it is more user friendly and blazingly fast. :)

Closing this ticket for now.

@rdmark rdmark closed this as completed Jun 11, 2024
@APCCV
Copy link
Author

APCCV commented Jun 22, 2024

Seem to have a Time Machine backup stuck on cleanup with this endlessly in logs:

Jun 21 13:32:46.354308 afpd[18659] {afp_dsi.c:193} (note:AFPDaemon): afp_dsi_transfer_session: successful primary reconnect                   
Jun 21 13:32:48.354861 afpd[18664] {auth.c:558} (note:AFPDaemon): afp_disconnect: primary reconnect succeeded                                 
Jun 21 13:34:46.606108 afpd[18659] {dsi_stream.c:503} (error:DSI): dsi_stream_read: len:0, unexpected EOF                                     
Jun 21 13:34:46.606281 afpd[18659] {dsi_stream.c:503} (error:DSI): dsi_stream_read: len:0, unexpected EOF                                     
Jun 21 13:34:46.606378 afpd[18659] {dsi_stream.c:259} (note:DSI): dsi_disconnect: entering disconnected state                                 
Jun 21 13:34:49.890858 afpd[18666] {auth.c:228} (note:AFPDaemon): Login by <user> (AFP3.4)                                                    
Jun 21 13:34:49.892028 afpd[18666] {auth.c:533} (note:AFPDaemon): afp_disconnect: trying primary reconnect                                    
Jun 21 13:34:49.892344 afpd[18543] {server_child.c:256} (note:Default): Reconnect: transferring session to child[18659]                       
Jun 21 13:34:49.892582 afpd[18543] {server_ipc.c:248} (note:AFPDaemon): Reconnect: killing new session child[18666] after transfer            

@APCCV
Copy link
Author

APCCV commented Jun 22, 2024

Nvm... it snapped out of it.

@rdmark
Copy link
Member

rdmark commented Jun 23, 2024

If you run into reproducible bugs, don't hesitate to file a new issue ticket, or start a discussion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants