Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fileid of oc_filecache in danger of hitting max value #26901

Closed
schnidrig opened this issue Jan 5, 2017 · 40 comments
Closed

fileid of oc_filecache in danger of hitting max value #26901

schnidrig opened this issue Jan 5, 2017 · 40 comments
Assignees
Labels
blue-ticket p1-urgent Critical issue, need to consider hotfix with just that issue sev1-critical Type:Bug
Milestone

Comments

@schnidrig
Copy link
Contributor

schnidrig commented Jan 5, 2017

Database

5 node galera cluster with mariadb.
maxscale with read/write splitter.

Expected behaviour

virtually unlimited lifetime of oc service

Actual behaviour

the primary key of the filecache table is in danger of overflowing.
At that point our oc instance will refuse any addition of new files.

Problem description

MariaDB [owncloud]> show create table oc_filecache\G
*************************** 1. row ***************************
	   Table: oc_filecache
Create Table: CREATE TABLE `oc_filecache` (
  `fileid` int(11) NOT NULL AUTO_INCREMENT,
  `storage` int(11) NOT NULL DEFAULT '0',
  `path` varchar(4000) COLLATE utf8_bin DEFAULT NULL,
  `path_hash` varchar(32) COLLATE utf8_bin NOT NULL DEFAULT '',
  `parent` int(11) NOT NULL DEFAULT '0',
  `name` varchar(250) COLLATE utf8_bin DEFAULT NULL,
  `mimetype` int(11) NOT NULL DEFAULT '0',
  `mimepart` int(11) NOT NULL DEFAULT '0',
  `size` bigint(20) NOT NULL DEFAULT '0',
  `mtime` int(11) NOT NULL DEFAULT '0',
  `storage_mtime` int(11) NOT NULL DEFAULT '0',
  `encrypted` int(11) NOT NULL DEFAULT '0',
  `unencrypted_size` bigint(20) NOT NULL DEFAULT '0',
  `etag` varchar(40) COLLATE utf8_bin DEFAULT NULL,
  `permissions` int(11) DEFAULT '0',
  PRIMARY KEY (`fileid`),
  UNIQUE KEY `fs_storage_path_hash` (`storage`,`path_hash`),
  KEY `fs_parent_name_hash` (`parent`,`name`),
  KEY `fs_storage_mimetype` (`storage`,`mimetype`),
  KEY `fs_storage_mimepart` (`storage`,`mimepart`),
  KEY `fs_storage_size` (`storage`,`size`,`fileid`)
) ENGINE=InnoDB AUTO_INCREMENT=252069614 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
1 row in set (0.00 sec)

The primary key is a signed integer with max value of 2147483647.

MariaDB [owncloud]> select max(fileid), count(fileid) from oc_filecache;
+-------------+---------------+
| max(fileid) | count(fileid) |
+-------------+---------------+
|   252065705 |      60772002 |
+-------------+---------------+

We have 60 mio files and the fileid is currently at 250 mio.
4 months ago that counter was at 146 mio. During the last 4 months we had a 3 node galera cluster.
That means 100 Mio increase in 4 months; with our new 5 node cluster instead of a old 3 nodes cluster that would have been a: 100/3*5 = 170 mio increase. Therefore we'll hit the 2147483647 boundary in about 3.5 years with the current user base and setup. However, given the growth rate of our service, I'd estimate that to be more something like 2 years at best.

Considering, that enterprise customers (at least those I know of) usually are one year behind the current release, I think it is about time to fix this in the db.

Please consider changing the fileid to BIGINT. Or at the very least make it unsigned.

More of a hack than a workaround would be to change the auto_increment settings.

MariaDB [owncloud]> show variables like '%auto_increment%';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| auto_increment_increment     | 5     |
| auto_increment_offset        | 1     |
| wsrep_auto_increment_control | ON    |
+------------------------------+-------+

But of course, that would give me tons of other head aches.

@PVince81
Copy link
Contributor

PVince81 commented Jan 9, 2017

Uh oh...

@butonic @DeepDiver1975 @PhilippSchaffrath

@PVince81 PVince81 added this to the 10.0 milestone Jan 9, 2017
@PVince81
Copy link
Contributor

PVince81 commented Jan 9, 2017

@butonic this is the wrong ticket to discuss this, it will likely get lost. Maybe post here instead: #12487 (or find or create a matching ticket).

Yes for using bigint for fileid if that can fix the issue in this one ticket.

@Helios07
Copy link

Helios07 commented Jan 9, 2017

It is even worse for us:

MariaDB [rwth_aachen]> select max(fileid), count(fileid) from oc_filecache;
+-------------+---------------+
| max(fileid) | count(fileid) |
+-------------+---------------+
|   529100321 |      51808811 |
+-------------+---------------+

(and we split our users to different instances already)

@PVince81
Copy link
Contributor

PVince81 commented Jan 9, 2017

We usually don't change DB schemas in a minor version, but might need to do an exception here... @felixboehm @DeepDiver1975 @butonic

I suggest to provide a PR for a quickfix that changes the column type to bigint (not sure about performance impact here during upgrade...) and address @butonic's other points separately.

@davidjericho
Copy link

This one hadn't crossed my mind, but it's definitely a problem for us too.

+-------------+---------------+
| max(fileid) | count(fileid) |
+-------------+---------------+
|   676222618 |      45597155 |
+-------------+---------------+

It'll be a huge problem with our planned growth this year.

@butonic
Copy link
Member

butonic commented Jan 31, 2017

Um ... this will require us to alter the sharing table and probably other tables that reference the fileid ... foreign keys anyone?

@PVince81
Copy link
Contributor

Ref: foreign keys #13143

Indeed, not only the oc_filecache will be affected.

Other tables that come to mind:

  • oc_vcategory_to_object which can use fileid as the object id
  • oc_share, the file_source and item_source columns
  • oc_properties will use fileid in the future: oc_properties must use fileid instead of path #25563
  • oc_systemtag_object_mapping
  • oc_mounts.root_id
  • oc_comments.object_id
  • activity table also uses fileid somewhere
  • ...

@PVince81
Copy link
Contributor

Need to also evaluate the impact of bigger values on the clients especially Javascript.

@PVince81
Copy link
Contributor

Or use strings for ids: #22330. Not sure about the performance impact with indices though.

@butonic
Copy link
Member

butonic commented Apr 24, 2017

just using bigint in dhe db is not enuogh. we neet to check if the code does cast to int, eg: https://github.com/owncloud/core/blob/master/lib/private/Files/Cache/Cache.php#L145

+1 for strings for ids and get this right.

@PVince81 PVince81 modified the milestones: triage, 10.0 May 15, 2017
@PVince81
Copy link
Contributor

setting to triage, needs proper design + scheduling

@pmaier1 @felixboehm

@wschaft
Copy link

wschaft commented Jul 9, 2017

Uh oh...

We need a fix ASAP:

MariaDB [owncloud]> select max(fileid), count(fileid) from oc_filecache;
+-------------+---------------+
| max(fileid) | count(fileid) |
+-------------+---------------+
|  2147483647 |     160067265 |
+-------------+---------------+

@PVince81
Copy link
Contributor

PVince81 commented Jul 9, 2017

Reproduction steps:

  1. Setup OC (I used 8.2 locally)
  2. insert a dummy entry in the DB with a high fileid, using a non-existing storage id: insert into oc_filecache (fileid, storage, path) values (2147483646, 60000, '');
  3. try creating a few folders or uploading a few files to make it exceed max int

Expected: no limit
Actual:

An exception occurred while executing 'INSERT INTO `oc_filecache` (`mimepart`,`mimetype`,`mtime`,`size`,`etag`,`storage_mtime`,`permissions`,`parent`,`path_hash`,`path`,`name`,`storage`) SELECT ?,?,?,?,?,?,?,?,?,?,?,? FROM `oc_filecache` WHERE `storage` = ? AND `path_hash` = ? HAVING COUNT(*) = 0' with params [\"1\", \"2\", 1499626958, -1, \"59627dce53dca\", 1499626958, 31, \"2\", \"f7723ab8c8e52c6c0711e88e739400f3\", \"files\\\/x\", \"x\", \"1\", \"1\", \"f7723ab8c8e52c6c0711e88e739400f3\"]:\n\nSQLSTATE[22003]: Numeric value out of range: 167 Out of range value for column 'fileid' at row 1

@cdamken cdamken added green-ticket p1-urgent Critical issue, need to consider hotfix with just that issue sev1-critical and removed blue-ticket sev2-high labels Jul 9, 2017
@PVince81
Copy link
Contributor

PVince81 commented Jul 9, 2017

Apparently file/folder entries get created on-disk but not in DB. This means that after switching to bigint, will need to run occ files:scan --all to make it detect all the created "ghost" files that weren't indexed.

@PVince81
Copy link
Contributor

PVince81 commented Jul 9, 2017

@butonic I checked with a debugger, reading a bigint value from DB with PHP at the code in https://github.com/owncloud/core/blob/master/lib/private/Files/Cache/Cache.php#L140.

The debugger shows that the fileid stayed the same for a value of 2147483659.

@PVince81
Copy link
Contributor

PVince81 commented Aug 3, 2017

@VicDeo how far are we from completion ? Have the backports to stable10 been done everywhere ?

@VicDeo
Copy link
Member

VicDeo commented Aug 3, 2017

Phase 1
core

activity

files_antivirus

search_lucene

search_elastic


database.xml adjusted, but this has no effect if the app was installed before this change

Need porting to migrations:
richdocuments

music

@PVince81
Copy link
Contributor

PVince81 commented Aug 7, 2017

We still need to think of:

⚠️ make a decision and sort out the 32-bit system issue

Apparently some things already broke on 32-bit systems due to casting of the size column which was already a bigint a long time ago: #28275

@PVince81
Copy link
Contributor

PVince81 commented Aug 7, 2017

Had a chat with @felixboehm and we'll only make sure that the bigint changes do not break OC completely on 32-bit systems for smaller values. However they will be no guarantee that values bigger than 32-bit are properly supported. So people with Rasperry Pis should still be able to use ownCloud fine assuming they don't scale to huge numbers.

Raised QA ticket to retest 10.0.3 on 32-bit here owncloud/QA#476

@PVince81
Copy link
Contributor

@VicDeo are we done with phase 1?

@VicDeo
Copy link
Member

VicDeo commented Aug 23, 2017

@PVince81 in general yes, but richdocuments and music should be ported to migrations for completion.
And probably branching is needed to keep non-bigintified versions that are compatible with OC below 10.0.3

@PVince81
Copy link
Contributor

moving to "planned" for second phase.

@PVince81 PVince81 modified the milestones: planned, development Aug 30, 2017
@ckamm
Copy link

ckamm commented Sep 5, 2017

@PVince81 What's the effect on the full id of this change? Currently 'getFileId()' in apps/dav/lib/Connector/Sabre/Node.php still does $id = sprintf('%08d', $this->info->getId()); - so I assume the zero-padding to 8 digits continues, but full ids will now increase in width when the numeric id exceeds 8 digits.

This works for the desktop client (takes all digits until first non-numeric character).

@PVince81
Copy link
Contributor

PVince81 commented Sep 5, 2017

@ckamm we should probably get rid of that weird compound id with padding and move to "oc:id" and have the instance id part exposed through the capabilities API ? (separate ticket)

@phil-davis
Copy link
Contributor

phil-davis commented Sep 5, 2017

8 digits is only 99,999,999 = 100 million minus 1. 32 bit numbers anyway go to around 2,000,000,000 (signed) or 4,000,000,000 (unsigned) and thus would already be using 9 or 10 digits anyway (for installs with > 99,999,999 ids). So the 8-digit thing should be an existing "feature" regardless of 64-bit changes.

@ckamm
Copy link

ckamm commented Sep 5, 2017

@phil-davis Good point.

@PVince81 Currently oc:id is the compound id and oc:fileid is the numeric one (opposite to the code in Node.php). We could start deprecating oc:id, the cost would be significant migration work in the client. I agree about moving this discussion to a separate ticket.

The bigint change looks good from the desktop client's point of view.

@PVince81
Copy link
Contributor

PVince81 commented Nov 3, 2017

Raised #29437 to follow up with phase 2 and other ids.

All the main ids (fileids and activity table ids) have been adjusted already, closing.

@lock
Copy link

lock bot commented Jul 31, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators Jul 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
blue-ticket p1-urgent Critical issue, need to consider hotfix with just that issue sev1-critical Type:Bug
Projects
None yet
Development

No branches or pull requests