Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong DateTime insert #20

Closed
v-v-vishnevskiy opened this issue Jan 19, 2018 · 14 comments
Closed

Wrong DateTime insert #20

v-v-vishnevskiy opened this issue Jan 19, 2018 · 14 comments
Assignees

Comments

@v-v-vishnevskiy
Copy link

Hi!

After insert datetime.datetime(2018, 1, 19, 10) through this driver I see '2018-01-19 13:00:00' value in table.
Timezone on my computer and clickhouse server is Moskow.

What I must do to see '2018-01-19 10:00:00' after insert?

@xzkostyan
Copy link
Member

Can you provide sequence of CREATE TABLE/INSERT/SELECT operations in clickhouse-client and in clickhouse-driver?

@v-v-vishnevskiy
Copy link
Author

v-v-vishnevskiy commented Jan 19, 2018

clickhouse_driver_test.py

import datetime
import requests
import clickhouse_driver
from clickhouse_driver.errors import Error

q_create = """
create table default.test_tbl (
      event_date Date,
      created_ts DateTime,
      id UInt64
    ) ENGINE = MergeTree(event_date, (event_date), 8192);
"""

q_insert = "INSERT INTO test_tbl (event_date,created_ts,id) VALUES"

config = {
    "host": "localhost",
    "port": 9111,
    "compression": "zstd"
}

connection = clickhouse_driver.Client(**config)

# create table
try:
    connection.execute("DESCRIBE default.test_tbl format JSON")
except Error as e:
    if e.code == 60:
        print("Creating table...")
        connection.execute(q_create)
    else:
        print(str(e))
        raise SystemExit(1)


# insert data
data = [(datetime.date(2018, 1, 19), datetime.datetime.utcnow(), 1), (datetime.date(2018, 1, 19), datetime.datetime.now(), 1)]
print("Insert data: ", q_insert, data)
connection.execute(q_insert, data)


# select
result = requests.get("http://localhost:8123/?query=SELECT * FROM test_tbl format JSON;")
if result.status_code == 200:
    print("\nHTTP Response:\n", result.text)
else:
    print("HTTP request error")

print("TCP Response:")
for row in connection.execute("SELECT * FROM test_tbl"):
    print(row)

My output:

Creating table...
Insert data:  INSERT INTO test_tbl (event_date,dt_now,dt_utcnow,id) VALUES [(datetime.date(2018, 1, 19), datetime.datetime(2018, 1, 19, 17, 14, 33, 704224), datetime.datetime(2018, 1, 19, 14, 14, 33, 704232), 1)]

HTTP Response:
 {
	"meta":
	[
		{
			"name": "event_date",
			"type": "Date"
		},
		{
			"name": "dt_now",
			"type": "DateTime"
		},
		{
			"name": "dt_utcnow",
			"type": "DateTime"
		},
		{
			"name": "id",
			"type": "UInt64"
		}
	],

	"data":
	[
		{
			"event_date": "2018-01-19",
			"dt_now": "2018-01-19 20:14:33",
			"dt_utcnow": "2018-01-19 17:14:33",
			"id": "1"
		}
	],

	"rows": 1,

	"statistics":
	{
		"elapsed": 0.000537618,
		"rows_read": 1,
		"bytes_read": 18
	}
}

TCP Response:
(datetime.date(2018, 1, 19), datetime.datetime(2018, 1, 19, 17, 14, 33), datetime.datetime(2018, 1, 19, 14, 14, 33), 1)

@xzkostyan
Copy link
Member

It seems that this is driver's SELECT issue.

From ClickHouse docs:

Date with time. Stored in four bytes as a Unix timestamp (unsigned).

The date with time is converted from text (divided into component parts) to binary and back, using the system's time zone at the time the client or server starts. In text format, information about daylight savings is lost.

By default, the client switches to the timezone of the server when it connects. You can change this behavior by enabling the client command-line option --use_client_time_zone.

On INSERT datetime converts into timestamp. On SELECT timestamp converted into datetime with UTC timezone now. DateTime should be constructed using server or client timezone, not UTC.

You can switch to UTC timezone as temporary solution.

@xzkostyan xzkostyan self-assigned this Jan 22, 2018
@xzkostyan
Copy link
Member

@v-v-vishnevskiy
Copy link
Author

Another question: what for you call timegm function in this?

@xzkostyan
Copy link
Member

DateTime is stored as seconds since epoch, so we need to call calendar.timegm or time.mktime for it.

@Thordin
Copy link

Thordin commented Mar 5, 2018

Hello I propose this fix for this problem.

https://gist.github.com/Thordin/c7442eb21839543e073ce9f2ca53dfd4

Any datetime with no timezone will be considered localtime and then changed to UTC when inserting into clickhouse.

When you select datetime from database, it will be converted from UTC to the client's local timezone.

This way you will get the same datetime as you inserted. If your datetime is in a different timezone, you can set the timezone and it will be saved as UTC correctly.

# setting timezone for now()
datetime.datetime.now(tz=some_time_zone)

# add timezone to datetime with no tz info
dt.replace(tzinfo=some_time_zone)

# datetime is in utc
dt.replace(tzinfo=datetime.timezone.utc)

This fix will require the library dateutils. There is no other way to get local timezone without another library.

You can maybe add an option to change the local timezone to something different from the client.

@xzkostyan
Copy link
Member

Hi, @Thordin.

This fix looks pretty good. But ClickHouse server has it's own timezone: <timezone>Europe/Moscow</timezone>.

I think we need to use server's timezone instead of using UTC tz=self.utc. Furthermore, we need to take into consideration clickhouse-client's use_client_time_zone flag. So, each datetime should be converted to server's timezone if use_client_time_zone doesn't specified.

What do you think?

@Thordin
Copy link

Thordin commented Mar 12, 2018

I am not an expert in clickhouse, but I think the server stores the time in UTC no matter what timezone the server has. I think the server timezone is only for the client to display and convert to UTC. I am not sure, maybe you can double check.

My clickhouse server timezone is not GMT.

SELECT timezone()

┌─timezone()──────────┐
│ America/Los_Angeles │
└─────────────────────┘

But when I insert a timezone in clickhouse-client by doing this... and get it with clickhouse-driver, it is UTC.

In clickhouse-client
CREATE TABLE default.test ( date Date,  dt DateTime) ENGINE = MergeTree(date, date, 8192)
INSERT INTO test (date,dt) VALUES (today(), now());

In Python with my date fix, but not datetime fix
db.execute('select * from test')
(datetime.date(2018, 3, 11), datetime.datetime(2018, 3, 12, 5, 10, 3)

So I don't think you should store the datetime in the server's timezone. But you should double check, maybe I made a mistake. But for me, my clickhouse server stores data in UTC even when the timezone is set to local, at least when clickhouse-driver reads it. That's why v-v-vishnevskiy discovered this problem.

@sergunSpb
Copy link

Hi @xzkostyan ,
i have same issue.

I think we need to use server's timezone instead of using UTC tz=self.utc. Furthermore, we need to take into consideration clickhouse-client's use_client_time_zone flag. So, each datetime should be converted to server's timezone if use_client_time_zone doesn't specified.

I think it will help.

@xzkostyan
Copy link
Member

Hi.

I'm working on this issue. Fix will be ready soon.

@xzkostyan
Copy link
Member

Hi,

Here is commit with timezone support: f645ead.

You can test it on latest master:

pip install git+https://github.com/mymarilyn/clickhouse-driver@master#egg=clickhouse-driver

@sergunSpb
Copy link

sergunSpb commented May 14, 2018

Hi,
in my case all works properly now.
Thanks!

@xzkostyan
Copy link
Member

Version 0.0.11 is released. It contains this issue fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants