Make gpuCI and pre-commit style configurations consistent #8215

Merged: 27 commits into branch-21.08 on Jul 15, 2021

Changes from all commits (27 commits)
2a7498a
Fix black, flake8 checks, attempt to fix isort
charlesbluca May 11, 2021
1b1a7e3
Fix clang-format checks
charlesbluca May 11, 2021
8eab7ab
Add new isort checks to gpuCI script
charlesbluca May 11, 2021
42ecf31
Add new isort return values to array
charlesbluca May 13, 2021
f2ad3b2
Exclude __init__.py files for isort-cudf
charlesbluca Jun 4, 2021
4cb923b
Bump isort to 5.6.0, include cython/pyi in checks
charlesbluca Jun 4, 2021
00ecc51
Bump isort to 5.6.4
charlesbluca Jun 14, 2021
220a722
Run updated hooks across codebase
charlesbluca Jun 21, 2021
bdb23db
Revert "Run updated hooks across codebase"
charlesbluca Jun 21, 2021
4bc7b69
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 21, 2021
c301e3f
Run hooks again
charlesbluca Jun 21, 2021
6cead3b
Temporary fix for style failures
charlesbluca Jun 22, 2021
a9f555e
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 24, 2021
d34eeb9
Run pre-commit hooks again
charlesbluca Jun 24, 2021
994e770
Retain old import style with isort skip
charlesbluca Jun 24, 2021
88eca4e
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 24, 2021
44399d3
Bump copyright on affected files
charlesbluca Jun 24, 2021
1dc0d9b
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 28, 2021
ab166a0
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 30, 2021
ef6691c
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jun 30, 2021
bb622a9
Run hooks again
charlesbluca Jun 30, 2021
323d3a5
Fix test failures
charlesbluca Jul 2, 2021
02ab8e6
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jul 2, 2021
1655435
Fix merge conflicts
charlesbluca Jul 8, 2021
31a1a23
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jul 8, 2021
3e799bd
Fix merge conflicts
charlesbluca Jul 13, 2021
367e1cd
Merge remote-tracking branch 'upstream/branch-21.08' into consistent-…
charlesbluca Jul 13, 2021
Files changed
34 changes: 28 additions & 6 deletions .pre-commit-config.yaml
@@ -1,23 +1,45 @@
repos:
- repo: https://github.com/timothycrosley/isort
rev: 5.0.7
- repo: https://github.com/pycqa/isort
rev: 5.6.4
hooks:
- id: isort
alias: isort-cudf
name: isort-cudf
args: ["--settings-path=python/cudf/setup.cfg"]
files: python/cudf/.*
exclude: __init__.py$
types: [text]
types_or: [python, cython, pyi]
- id: isort
alias: isort-cudf-kafka
name: isort-cudf-kafka
args: ["--settings-path=python/cudf_kafka/setup.cfg"]
files: python/cudf_kafka/.*
types: [text]
types_or: [python, cython]
- id: isort
alias: isort-custreamz
name: isort-custreamz
args: ["--settings-path=python/custreamz/setup.cfg"]
files: python/custreamz/.*
- id: isort
alias: isort-dask-cudf
name: isort-dask-cudf
args: ["--settings-path=python/dask_cudf/setup.cfg"]
files: python/dask_cudf/.*
- repo: https://github.com/ambv/black
rev: 19.10b0
hooks:
- id: black
files: python/.*
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.3
hooks:
- id: flake8
alias: flake8
name: flake8
args: ["--config=python/.flake8"]
types: [python]
- repo: https://gitlab.com/pycqa/flake8
rev: 3.8.3
hooks:
files: python/.*\.py$
- id: flake8
alias: flake8-cython
name: flake8-cython
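Note: the aliased hooks above can be run individually with pre-commit; the alias: value is the id used on the command line. A minimal sketch, assuming pre-commit is installed and invoked from the repository root:

# Run one aliased isort hook over the whole tree, not just staged files
pre-commit run isort-cudf --all-files

# Run every configured hook, approximating what gpuCI checks
pre-commit run --all-files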
68 changes: 54 additions & 14 deletions ci/checks/style.sh
@@ -14,41 +14,77 @@ LANG=C.UTF-8
. /opt/conda/etc/profile.d/conda.sh
conda activate rapids

# Run isort and get results/return code
ISORT=`isort --check-only python/**/*.py`
ISORT_RETVAL=$?
# Run isort-cudf and get results/return code
ISORT_CUDF=`isort python/cudf --check-only --skip-glob *.pyx --settings-path=python/cudf/setup.cfg 2>&1`
ISORT_CUDF_RETVAL=$?

# Run isort-cudf-kafka and get results/return code
ISORT_CUDF_KAFKA=`isort python/cudf_kafka --check-only --settings-path=python/cudf_kafka/setup.cfg 2>&1`
ISORT_CUDF_KAFKA_RETVAL=$?

# Run isort-custreamz and get results/return code
ISORT_CUSTREAMZ=`isort python/custreamz --check-only --settings-path=python/custreamz/setup.cfg 2>&1`
ISORT_CUSTREAMZ_RETVAL=$?

# Run isort-dask-cudf and get results/return code
ISORT_DASK_CUDF=`isort python/dask_cudf --check-only --settings-path=python/dask_cudf/setup.cfg 2>&1`
ISORT_DASK_CUDF_RETVAL=$?

# Run black and get results/return code
BLACK=`black --check python`
BLACK=`black --check python 2>&1`
BLACK_RETVAL=$?

# Run flake8 and get results/return code
FLAKE=`flake8 --config=python/.flake8 python`
FLAKE=`flake8 --config=python/.flake8 python 2>&1`
FLAKE_RETVAL=$?

# Run flake8-cython and get results/return code
FLAKE_CYTHON=`flake8 --config=python/.flake8.cython`
FLAKE_CYTHON=`flake8 --config=python/.flake8.cython 2>&1`
FLAKE_CYTHON_RETVAL=$?

# Run mypy and get results/return code
MYPY_CUDF=`mypy --config=python/cudf/setup.cfg python/cudf/cudf`
MYPY_CUDF=`mypy --config=python/cudf/setup.cfg python/cudf/cudf 2>&1`
MYPY_CUDF_RETVAL=$?

# Run pydocstyle and get results/return code
PYDOCSTYLE=`pydocstyle --config=python/.flake8 python`
PYDOCSTYLE=`pydocstyle --config=python/.flake8 python 2>&1`
PYDOCSTYLE_RETVAL=$?

# Run clang-format and check for a consistent code format
CLANG_FORMAT=`python cpp/scripts/run-clang-format.py 2>&1`
CLANG_FORMAT_RETVAL=$?

# Output results if failure otherwise show pass
if [ "$ISORT_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: isort style check; begin output\n\n"
echo -e "$ISORT"
echo -e "\n\n>>>> FAILED: isort style check; end output\n\n"
if [ "$ISORT_CUDF_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: isort-cudf style check; begin output\n\n"
echo -e "$ISORT_CUDF"
echo -e "\n\n>>>> FAILED: isort-cudf style check; end output\n\n"
else
echo -e "\n\n>>>> PASSED: isort-cudf style check\n\n"
fi

if [ "$ISORT_CUDF_KAFKA_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: isort-cudf-kafka style check; begin output\n\n"
echo -e "$ISORT_CUDF_KAFKA"
echo -e "\n\n>>>> FAILED: isort-cudf-kafka style check; end output\n\n"
else
echo -e "\n\n>>>> PASSED: isort-cudf-kafka style check\n\n"
fi

if [ "$ISORT_CUSTREAMZ_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: isort-custreamz style check; begin output\n\n"
echo -e "$ISORT_CUSTREAMZ"
echo -e "\n\n>>>> FAILED: isort-custreamz style check; end output\n\n"
else
echo -e "\n\n>>>> PASSED: isort-custreamz style check\n\n"
fi

if [ "$ISORT_DASK_CUDF_RETVAL" != "0" ]; then
echo -e "\n\n>>>> FAILED: isort-dask-cudf style check; begin output\n\n"
echo -e "$ISORT_DASK_CUDF"
echo -e "\n\n>>>> FAILED: isort-dask-cudf style check; end output\n\n"
else
echo -e "\n\n>>>> PASSED: isort style check\n\n"
echo -e "\n\n>>>> PASSED: isort-dask-cudf style check\n\n"
fi

if [ "$BLACK_RETVAL" != "0" ]; then
@@ -104,7 +140,11 @@ HEADER_META=`ci/checks/headers_test.sh`
HEADER_META_RETVAL=$?
echo -e "$HEADER_META"

RETVALS=($ISORT_RETVAL $BLACK_RETVAL $FLAKE_RETVAL $FLAKE_CYTHON_RETVAL $PYDOCSTYLE_RETVAL $CLANG_FORMAT_RETVAL $HEADER_META_RETVAL $MYPY_CUDF_RETVAL)
RETVALS=(
$ISORT_CUDF_RETVAL $ISORT_CUDF_KAFKA_RETVAL $ISORT_CUSTREAMZ_RETVAL $ISORT_DASK_CUDF_RETVAL
$BLACK_RETVAL $FLAKE_RETVAL $FLAKE_CYTHON_RETVAL $PYDOCSTYLE_RETVAL $CLANG_FORMAT_RETVAL
$HEADER_META_RETVAL $MYPY_CUDF_RETVAL
)
IFS=$'\n'
RETVAL=`echo "${RETVALS[*]}" | sort -nr | head -n1`

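Note: the script reports every check and then exits with the worst of the collected return codes. A minimal, self-contained sketch of that aggregation pattern, with illustrative values standing in for the real checks:

#!/bin/bash
# Stand-ins for the per-check return codes gathered above
RETVALS=(0 1 0 2 0)

# Join on newlines, sort numerically descending, keep the largest value
IFS=$'\n'
RETVAL=`echo "${RETVALS[*]}" | sort -nr | head -n1`

exit $RETVAL  # exits 2 here, so any single failing check fails the whole script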
18 changes: 9 additions & 9 deletions cpp/libcudf_kafka/include/cudf_kafka/kafka_consumer.hpp
@@ -49,7 +49,7 @@ class kafka_consumer : public cudf::io::datasource {
* @param configs key/value pairs of librdkafka configurations that will be
* passed to the librdkafka client
*/
kafka_consumer(std::map<std::string, std::string> const &configs);
kafka_consumer(std::map<std::string, std::string> const& configs);

/**
* @brief Instantiate a Kafka consumer object. Documentation for librdkafka configurations can be
@@ -66,13 +66,13 @@
* before batch_timeout, a smaller subset will be returned
* @param delimiter optional delimiter to insert into the output between kafka messages, Ex: "\n"
*/
kafka_consumer(std::map<std::string, std::string> const &configs,
std::string const &topic_name,
kafka_consumer(std::map<std::string, std::string> const& configs,
std::string const& topic_name,
int partition,
int64_t start_offset,
int64_t end_offset,
int batch_timeout,
std::string const &delimiter);
std::string const& delimiter);

/**
* @brief Returns a buffer with a subset of data from Kafka Topic
@@ -100,7 +100,7 @@
*
* @return The number of bytes read (can be smaller than size)
*/
size_t host_read(size_t offset, size_t size, uint8_t *dst) override;
size_t host_read(size_t offset, size_t size, uint8_t* dst) override;

/**
* @brief Commits an offset to a specified Kafka Topic/Partition instance
@@ -112,7 +112,7 @@
* @param[in] offset Offset that should be set for the topic/partition pair
*
*/
void commit_offset(std::string const &topic, int partition, int64_t offset);
void commit_offset(std::string const& topic, int partition, int64_t offset);

/**
* @brief Retrieve the watermark offset values for a topic/partition
@@ -124,7 +124,7 @@
* the latest value will be retrieved from the Kafka broker by making a network
* request.
*/
std::map<std::string, int64_t> get_watermark_offset(std::string const &topic,
std::map<std::string, int64_t> get_watermark_offset(std::string const& topic,
int partition,
int timeout,
bool cached);
@@ -144,7 +144,7 @@
*
* @return Latest offset for the specified topic/partition pair
*/
int64_t get_committed_offset(std::string const &topic, int partition);
int64_t get_committed_offset(std::string const& topic, int partition);

/**
* @brief Query the Kafka broker for the list of Topic partitions for a Topic. If no topic is
@@ -189,7 +189,7 @@
std::string buffer;

private:
RdKafka::ErrorCode update_consumer_topic_partition_assignment(std::string const &topic,
RdKafka::ErrorCode update_consumer_topic_partition_assignment(std::string const& topic,
int partition,
int64_t offset);

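Note: the const &x to const& x churn in this header is clang-format's left-aligned reference style. A hedged sketch of checking that rule with a stock clang-format; the option name (PointerAlignment: Left) is assumed here to match the repo's .clang-format:

# Reformat a one-line declaration with left-aligned references
echo 'void commit_offset(std::string const &topic, int partition);' \
  | clang-format --style='{PointerAlignment: Left}'
# expected output: void commit_offset(std::string const& topic, int partition);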
46 changes: 23 additions & 23 deletions cpp/libcudf_kafka/src/kafka_consumer.cpp
@@ -24,10 +24,10 @@ namespace io {
namespace external {
namespace kafka {

kafka_consumer::kafka_consumer(std::map<std::string, std::string> const &configs)
kafka_consumer::kafka_consumer(std::map<std::string, std::string> const& configs)
: kafka_conf(RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL))
{
for (auto const &key_value : configs) {
for (auto const& key_value : configs) {
std::string error_string;
CUDF_EXPECTS(RdKafka::Conf::ConfResult::CONF_OK ==
kafka_conf->set(key_value.first, key_value.second, error_string),
@@ -44,13 +44,13 @@ kafka_consumer::kafka_consumer(std::map<std::string, std::string> const &configs
RdKafka::KafkaConsumer::create(kafka_conf.get(), errstr));
}

kafka_consumer::kafka_consumer(std::map<std::string, std::string> const &configs,
std::string const &topic_name,
kafka_consumer::kafka_consumer(std::map<std::string, std::string> const& configs,
std::string const& topic_name,
int partition,
int64_t start_offset,
int64_t end_offset,
int batch_timeout,
std::string const &delimiter)
std::string const& delimiter)
: topic_name(topic_name),
partition(partition),
start_offset(start_offset),
@@ -60,7 +60,7 @@
{
kafka_conf = std::unique_ptr<RdKafka::Conf>(RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL));

for (auto const &key_value : configs) {
for (auto const& key_value : configs) {
std::string error_string;
CUDF_EXPECTS(RdKafka::Conf::ConfResult::CONF_OK ==
kafka_conf->set(key_value.first, key_value.second, error_string),
@@ -85,10 +85,10 @@ std::unique_ptr<cudf::io::datasource::buffer> kafka_consumer::host_read(size_t o
{
if (offset > buffer.size()) { return 0; }
size = std::min(size, buffer.size() - offset);
return std::make_unique<non_owning_buffer>((uint8_t *)buffer.data() + offset, size);
return std::make_unique<non_owning_buffer>((uint8_t*)buffer.data() + offset, size);
}

size_t kafka_consumer::host_read(size_t offset, size_t size, uint8_t *dst)
size_t kafka_consumer::host_read(size_t offset, size_t size, uint8_t* dst)
{
if (offset > buffer.size()) { return 0; }
auto const read_size = std::min(size, buffer.size() - offset);
@@ -102,9 +102,9 @@ size_t kafka_consumer::size() const { return buffer.size(); }
* Change the TOPPAR assignment for this consumer instance
*/
RdKafka::ErrorCode kafka_consumer::update_consumer_topic_partition_assignment(
std::string const &topic, int partition, int64_t offset)
std::string const& topic, int partition, int64_t offset)
{
std::vector<RdKafka::TopicPartition *> topic_partitions;
std::vector<RdKafka::TopicPartition*> topic_partitions;
topic_partitions.push_back(RdKafka::TopicPartition::create(topic, partition, offset));
return consumer.get()->assign(topic_partitions);
}
@@ -121,7 +121,7 @@ void kafka_consumer::consume_to_buffer()
consumer->consume((end - std::chrono::steady_clock::now()).count())};

if (msg->err() == RdKafka::ErrorCode::ERR_NO_ERROR) {
buffer.append(static_cast<char *>(msg->payload()));
buffer.append(static_cast<char*>(msg->payload()));
buffer.append(delimiter);
messages_read++;
} else if (msg->err() == RdKafka::ErrorCode::ERR__PARTITION_EOF) {
@@ -134,15 +134,15 @@
std::map<std::string, std::string> kafka_consumer::current_configs()
{
std::map<std::string, std::string> configs;
std::list<std::string> *dump = kafka_conf->dump();
std::list<std::string>* dump = kafka_conf->dump();
for (auto it = dump->begin(); it != dump->end(); std::advance(it, 2))
configs.insert({*it, *std::next(it)});
return configs;
}

int64_t kafka_consumer::get_committed_offset(std::string const &topic, int partition)
int64_t kafka_consumer::get_committed_offset(std::string const& topic, int partition)
{
std::vector<RdKafka::TopicPartition *> toppar_list;
std::vector<RdKafka::TopicPartition*> toppar_list;
toppar_list.push_back(RdKafka::TopicPartition::create(topic, partition));

// Query Kafka to populate the TopicPartitions with the desired offsets
@@ -160,7 +160,7 @@ std::map<std::string, std::vector<int32_t>> kafka_consumer::list_topics(std::str
auto spec_topic = std::unique_ptr<RdKafka::Topic>(
RdKafka::Topic::create(consumer.get(), specific_topic, nullptr, errstr));

RdKafka::Metadata *md;
RdKafka::Metadata* md;
CUDF_EXPECTS(
RdKafka::ERR_NO_ERROR ==
consumer->metadata(spec_topic == nullptr, spec_topic.get(), &md, default_timeout),
@@ -169,19 +169,19 @@
}();
std::map<std::string, std::vector<int32_t>> topic_parts;

for (auto const &topic : *(metadata->topics())) {
auto &part_ids = topic_parts[topic->topic()];
auto const &parts = *(topic->partitions());
for (auto const& topic : *(metadata->topics())) {
auto& part_ids = topic_parts[topic->topic()];
auto const& parts = *(topic->partitions());
std::transform(
parts.cbegin(), parts.cend(), std::back_inserter(part_ids), [](auto const &part) {
parts.cbegin(), parts.cend(), std::back_inserter(part_ids), [](auto const& part) {
return part->id();
});
}

return topic_parts;
}

std::map<std::string, int64_t> kafka_consumer::get_watermark_offset(std::string const &topic,
std::map<std::string, int64_t> kafka_consumer::get_watermark_offset(std::string const& topic,
int partition,
int timeout,
bool cached)
@@ -212,10 +212,10 @@ std::map<std::string, int64_t> kafka_consumer::get_watermark_offset(std::string
return results;
}

void kafka_consumer::commit_offset(std::string const &topic, int partition, int64_t offset)
void kafka_consumer::commit_offset(std::string const& topic, int partition, int64_t offset)
{
std::vector<RdKafka::TopicPartition *> partitions_;
RdKafka::TopicPartition *toppar = RdKafka::TopicPartition::create(topic, partition, offset);
std::vector<RdKafka::TopicPartition*> partitions_;
RdKafka::TopicPartition* toppar = RdKafka::TopicPartition::create(topic, partition, offset);
CUDF_EXPECTS(toppar != nullptr, "RdKafka failed to create TopicPartition");
toppar->set_offset(offset);
partitions_.push_back(toppar);
4 changes: 2 additions & 2 deletions cpp/libcudf_kafka/tests/kafka_consumer_tests.cpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2020, NVIDIA CORPORATION.
* Copyright (c) 2020-2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -20,8 +20,8 @@
#include <string>
#include "cudf_kafka/kafka_consumer.hpp"

#include <cudf/io/datasource.hpp>
#include <cudf/io/csv.hpp>
#include <cudf/io/datasource.hpp>

namespace kafka = cudf::io::external::kafka;

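Note: the include swap above is alphabetical ordering, most likely applied by clang-format's include sorting rather than by hand; SortIncludes is assumed here to be enabled in the repo's settings. A small sketch:

# Pipe the old include order through clang-format; csv.hpp sorts first
printf '#include <cudf/io/datasource.hpp>\n#include <cudf/io/csv.hpp>\n' \
  | clang-format --style='{SortIncludes: true}'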