Enables production data from CSV file and fixes failing probability distribution test #425

Merged
merged 9 commits into from
Aug 6, 2021
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,8 @@ This project adheres to [Semantic Versioning](https://semver.org/).
## Unreleased

### Added
- [#417] (https://github.com/equinor/flownet/pull/417) Added functionality to history match dissolved salts (TDS) in produced water.
- [#425](https://github.com/equinor/flownet/pull/425) Added functionality to load production data from CSV file. Changed position of observation `vectors` entry in configuration file (now one level higher, same level as `simulation` and `database` in `data_source` entry).
- [#417](https://github.com/equinor/flownet/pull/417) Added functionality to history match dissolved salts (TDS) in produced water.
- [#404](https://github.com/equinor/flownet/pull/404) Added possibility for regional multipliers for permeability, porosity and bulkvolume multiplier. The current implementation allows for defining either one global multiplier, or regional multipliers based on a region parameter extracted from an existing simulation model (typically FIPNUM, EQLNUM, SATNUM etc.). The regional multiplier will be in addition to the per-tube multipliers. New keys in the config yaml are: porosity_regional_scheme (global, individual or regions_from_sim), porosity_regional (define prior same way as for other model parameters) and porosity_parameter_from_sim_model (name of region parameter in simulation model). The same three keys exist for permeability and bulkvolume_mult.
- [#383](https://github.com/equinor/flownet/pull/383) Added option to either define a prior distribution for KRWMAX directly by using krwmax in the config yaml, or to let KRWMAX be calculated as KRWEND + delta. To do the latter, set krwmax_add_to_krwend to true, and then the prior distribution definition in the config yaml for krwmax will be interpreted as a prior distribution for the delta value to be added to KRWEND to get the KRWMAX.
- [#386](https://github.com/equinor/flownet/pull/386) Expose FlowNet timeout to user.
15 changes: 14 additions & 1 deletion docs/configuration_file.rst
@@ -27,6 +27,8 @@ Example of the entire flownet part of the configuration yaml file:

  flownet:
    data_source:
      database:
        input_data: ../input_data/norne_production_data.csv
      simulation:
        input_case: ../input_model/norne/NORNE_ATW2013
      vectors:
@@ -182,12 +184,21 @@ FlowNet will extract the data used to construct and condition the model from an
FlowNet has an option to generate separate FlowNet models for each layer. To initiate this, supply a list of lists containing the
start and end layer in the input simulation model for each distinct layer.

database
~~~~~~~~~~

FlowNet will extract the production data used to history match the model from a CSV file.

* **input_data**: Path to the production data CSV file.

Example yaml section:

.. code-block:: yaml

  flownet:
    data_source:
      database:
        input_data: /path/to/production_data.csv
      simulation:
        input_case: /path/to/simulation_model.DATA
      vectors:
@@ -204,7 +215,9 @@ Example yaml section:
In this example, the input simulation model (which has been simulated with Flow or Eclipse or similar) will be found in
*/path/to/simulation_model.DATA*, the vectors to use in the conditioning of the FlowNet model are *WOPR* and *WGPR*, each
with a relative error of 10% and minimum error of 50 (Sm3). Two FlowNet models will be created, one based on layers 1 to 5
in the input simulation model, and one based on layers 6 to 10 in the input simulation model.
in the input simulation model, and one based on layers 6 to 10 in the input simulation model. If no input database CSV file
containing production data is provided, FlowNet will use the simulated production data from the input simulation model.
If a CSV file is specified, the production data from the CSV file will be used.

resampling
~~~~~~~~~~
5 changes: 4 additions & 1 deletion src/flownet/ahm/_run_ahm.py
@@ -26,7 +26,7 @@
FaultTransmissibility,
Parameter,
)
from ..data import FlowData
from ..data import FlowData, CSVData


def _set_up_ahm_and_run_ert(
@@ -542,6 +542,9 @@ def run_flownet_history_matching(
        layers=config.flownet.data_source.simulation.layers,
    )
    df_production_data: pd.DataFrame = field_data.production
    if config.flownet.data_source.database.input_data:
        csv_data = CSVData(config.flownet.data_source.database.input_data)
        df_production_data = csv_data.production
    df_well_connections: pd.DataFrame = field_data.get_well_connections(
        config.flownet.perforation_handling_strategy
    )
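
The hunk above makes the CSV file take precedence over the simulated production data whenever `input_data` is set. A minimal sketch of that selection rule follows; it uses only the standard library, and `DatabaseConfig` and `select_production_source` are hypothetical names for illustration, not part of FlowNet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabaseConfig:
    # Hypothetical stand-in for the parsed `data_source.database` entry.
    input_data: Optional[str] = None


def select_production_source(database: DatabaseConfig) -> str:
    """Return which source would supply the production data."""
    # A missing or empty path is falsy, so the simulation model is the default.
    if database.input_data:
        return "csv"
    return "simulation"


print(select_production_source(DatabaseConfig()))
print(select_production_source(DatabaseConfig("/path/to/production_data.csv")))
```

Because the check relies on truthiness, an empty string behaves the same as an omitted entry in this sketch.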
421 changes: 216 additions & 205 deletions src/flownet/config_parser/_config_parser.py

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions src/flownet/data/__init__.py
@@ -1,3 +1,4 @@
from ..data import from_source

from ..data.from_flow import FromSource, FlowData
from ..data.from_csv import CSVData
64 changes: 64 additions & 0 deletions src/flownet/data/from_csv.py
@@ -0,0 +1,64 @@
from pathlib import Path
from typing import Union

import pandas as pd


class CSVData:
    """
    CSV data source class

    Args:
        input_data: Full path to CSV file to load production data from

    """

    def __init__(
        self,
        input_data: Union[Path, str],
    ):
        super().__init__()

        self._input_data: Path = Path(input_data)

    # pylint: disable=too-many-branches
    def _production_data(self) -> pd.DataFrame:
        """
        Function to read production data for all producers and injectors from a CSV file.

        Returns:
Collaborator:

Now the function will return only the columns present in the CSV file (if e.g. WGPR is not in the CSV file, it will not be returned filled with NaNs, as I think it would be in from_flow.py). I'm not sure if the code needs all vectors, so we may not need NaNs, but the docstring should be consistent with the code.

Contributor Author:

I don't know if I understand the problem or how to solve it. Are you saying that we should check if all supported summary vectors are present in the CSV file? If so, is there a list of all supported summary vectors anywhere in the code?

            A DataFrame with a DateTimeIndex and the following columns:
                - date          equal to index
                - WELL_NAME     Well name as used in Flow
                - WOPR          Well Oil Production Rate
                - WGPR          Well Gas Production Rate
                - WWPR          Well Water Production Rate
                - WOPT          Well Cumulative Oil Production
                - WGPT          Well Cumulative Gas Production
                - WWPT          Well Cumulative Water Production
                - WBHP          Well Bottom Hole Pressure
                - WTHP          Well Tubing Head Pressure
                - WGIR          Well Gas Injection Rate
                - WWIR          Well Water Injection Rate
                - WSPR          Well Salt Production Rate
                - WSIR          Well Salt Injection Rate
                - WSPT          Well Cumulative Salt Production
                - WSIT          Well Cumulative Salt Injection
                - WSTAT         Well status (OPEN, SHUT, STOP)
                - TYPE          Well Type: "OP", "GP", "WI", "GI"
                - PHASE         Main producing/injecting phase fluid: "OIL", "GAS", "WATER"

        Todo:
            * Remove deprecation warning suppression when solved in LibEcl.
            * Improve robustness of setting of Phase and Type.

        """
        df_production_data = pd.read_csv(self._input_data)
        df_production_data["date"] = pd.to_datetime(df_production_data["date"]).dt.date
        df_production_data = df_production_data.set_index("date", drop=False)
        return df_production_data

    @property
    def production(self) -> pd.DataFrame:
        """dataframe with all production data"""
        return self._production_data()
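
The review thread above asks whether the CSV file should be checked against the documented vectors. One way such a check could look is sketched below, using only the standard library rather than pandas. `EXPECTED_COLUMNS` is an assumed subset of the docstring's column list and `missing_columns` is a hypothetical helper; neither is taken from FlowNet.

```python
import csv
import io
from typing import List

# Assumed subset of the columns listed in the docstring; not taken from FlowNet.
EXPECTED_COLUMNS = ["date", "WELL_NAME", "WOPR", "WGPR", "WWPR"]


def missing_columns(csv_text: str, expected: List[str]) -> List[str]:
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None when the input has no header row at all.
    header = reader.fieldnames or []
    return [name for name in expected if name not in header]


SAMPLE = "date,WELL_NAME,WOPR,WWPR\n2021-08-06,A-1,100.0,5.0\n"
print(missing_columns(SAMPLE, EXPECTED_COLUMNS))  # prints ['WGPR']
```

Whether missing vectors should raise an error, be filled with NaNs, or simply be documented as optional is left open in the thread; the sketch only reports them.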
2 changes: 1 addition & 1 deletion src/flownet/ert/_create_ert_setup.py
@@ -112,7 +112,7 @@ def create_observation_file(
{
"dates": dates,
"schedule": schedule,
"error_config": config.flownet.data_source.simulation.vectors,
"error_config": config.flownet.data_source.vectors,
"num_beginning_date": setting[1],
"num_end_date": setting[2],
"last_training_date": dates[num_training_dates - 1],
105 changes: 50 additions & 55 deletions tests/test_check_obsfiles_ert_yaml.py
@@ -189,108 +189,103 @@ def test_check_obsfiles_ert_yaml() -> None:
# pylint: disable=maybe-no-member
config = collections.namedtuple("configuration", "flownet")
config.flownet = collections.namedtuple("flownet", "data_source")
config.flownet.data_source = collections.namedtuple("data_source", "simulation")
config.flownet.data_source.simulation = collections.namedtuple(
"simulation", "vectors"
)
config.flownet.data_source.simulation.vectors = collections.namedtuple(
"vectors", "WTHP"
)
config.flownet.data_source.simulation.vectors.WOPR = collections.namedtuple(
config.flownet.data_source = collections.namedtuple("data_source", "vectors")
config.flownet.data_source.vectors = collections.namedtuple("vectors", "WTHP")
config.flownet.data_source.vectors.WOPR = collections.namedtuple(
"WOPR", "min_error"
)
config.flownet.data_source.simulation.vectors.WOPR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WOPR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WOPR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WOPR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WGPR = collections.namedtuple(
config.flownet.data_source.vectors.WGPR = collections.namedtuple(
"WGPR", "min_error"
)
config.flownet.data_source.simulation.vectors.WGPR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WGPR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WGPR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WGPR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WWPR = collections.namedtuple(
config.flownet.data_source.vectors.WWPR = collections.namedtuple(
"WWPR", "min_error"
)
config.flownet.data_source.simulation.vectors.WWPR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WWPR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WWPR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WWPR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WOPT = collections.namedtuple(
config.flownet.data_source.vectors.WOPT = collections.namedtuple(
"WOPT", "min_error"
)
config.flownet.data_source.simulation.vectors.WOPT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WOPT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WOPT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WOPT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WGPT = collections.namedtuple(
config.flownet.data_source.vectors.WGPT = collections.namedtuple(
"WGPT", "min_error"
)
config.flownet.data_source.simulation.vectors.WGPT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WGPT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WGPT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WGPT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WWPT = collections.namedtuple(
config.flownet.data_source.vectors.WWPT = collections.namedtuple(
"WWPT", "min_error"
)
config.flownet.data_source.simulation.vectors.WWPT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WWPT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WWPT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WWPT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WBHP = collections.namedtuple(
config.flownet.data_source.vectors.WBHP = collections.namedtuple(
"WBHP", "min_error"
)
config.flownet.data_source.simulation.vectors.WBHP.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WBHP.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WBHP.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WBHP.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WTHP = collections.namedtuple(
config.flownet.data_source.vectors.WTHP = collections.namedtuple(
"WTHP", "min_error"
)
config.flownet.data_source.simulation.vectors.WTHP.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WTHP.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WTHP.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WTHP.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WGIR = collections.namedtuple(
config.flownet.data_source.vectors.WGIR = collections.namedtuple(
"WGIR", "min_error"
)
config.flownet.data_source.simulation.vectors.WGIR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WGIR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WGIR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WGIR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WWIR = collections.namedtuple(
config.flownet.data_source.vectors.WWIR = collections.namedtuple(
"WWIR", "min_error"
)
config.flownet.data_source.simulation.vectors.WWIR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WWIR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WWIR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WWIR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WGIT = collections.namedtuple(
config.flownet.data_source.vectors.WGIT = collections.namedtuple(
"WGIT", "min_error"
)
config.flownet.data_source.simulation.vectors.WGIT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WGIT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WGIT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WGIT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WWIT = collections.namedtuple(
config.flownet.data_source.vectors.WWIT = collections.namedtuple(
"WWIT", "min_error"
)
config.flownet.data_source.simulation.vectors.WWIT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WWIT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WWIT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WWIT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WSPR = collections.namedtuple(
config.flownet.data_source.vectors.WSPR = collections.namedtuple(
"WSPR", "min_error"
)
config.flownet.data_source.simulation.vectors.WSPR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WSPR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WSPR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WSPR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WSPT = collections.namedtuple(
config.flownet.data_source.vectors.WSPT = collections.namedtuple(
"WSPT", "min_error"
)
config.flownet.data_source.simulation.vectors.WSPT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WSPT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WSPT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WSPT.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WSIR = collections.namedtuple(
config.flownet.data_source.vectors.WSIR = collections.namedtuple(
"WSIR", "min_error"
)
config.flownet.data_source.simulation.vectors.WSIR.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WSIR.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WSIR.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WSIR.rel_error = _REL_ERROR

config.flownet.data_source.simulation.vectors.WSIT = collections.namedtuple(
config.flownet.data_source.vectors.WSIT = collections.namedtuple(
"WSIT", "min_error"
)
config.flownet.data_source.simulation.vectors.WSIT.min_error = _MIN_ERROR
config.flownet.data_source.simulation.vectors.WSIT.rel_error = _REL_ERROR
config.flownet.data_source.vectors.WSIT.min_error = _MIN_ERROR
config.flownet.data_source.vectors.WSIT.rel_error = _REL_ERROR

config.flownet.data_source.resampling = _RESAMPLING
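
The test above repeats the same three assignments for every vector. As an illustration only, the same attribute layout (`config.flownet.data_source.vectors.<NAME>.min_error` and `.rel_error`) could be built in a loop. The sketch below uses `types.SimpleNamespace` instead of the test's `namedtuple` objects, and `MIN_ERROR`, `REL_ERROR` and `build_vectors_config` are hypothetical placeholders for the test's own values.

```python
from types import SimpleNamespace
from typing import List

# Placeholder values; the real test defines its own _MIN_ERROR and _REL_ERROR.
MIN_ERROR = 10
REL_ERROR = 0.05

VECTOR_NAMES = [
    "WOPR", "WGPR", "WWPR", "WOPT", "WGPT", "WWPT", "WBHP", "WTHP",
    "WGIR", "WWIR", "WGIT", "WWIT", "WSPR", "WSPT", "WSIR", "WSIT",
]


def build_vectors_config(
    names: List[str], min_error: float, rel_error: float
) -> SimpleNamespace:
    """Create one attribute per vector, each holding its error settings."""
    return SimpleNamespace(
        **{
            name: SimpleNamespace(min_error=min_error, rel_error=rel_error)
            for name in names
        }
    )


# Vectors now sit directly under data_source, next to simulation and database.
config = SimpleNamespace(
    flownet=SimpleNamespace(
        data_source=SimpleNamespace(
            vectors=build_vectors_config(VECTOR_NAMES, MIN_ERROR, REL_ERROR)
        )
    )
)
```

This is a design alternative, not what the PR does; the merged test keeps the explicit per-vector assignments.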

3 changes: 2 additions & 1 deletion tests/test_probability_distributions.py
@@ -1,5 +1,6 @@
from typing import List

import numpy as np
import pandas as pd


@@ -132,7 +133,7 @@

DISTRIBUTION_DF = pd.DataFrame(DATA)
# NaNs to None
DISTRIBUTION_DF = DISTRIBUTION_DF.where(DISTRIBUTION_DF.notnull(), None)
DISTRIBUTION_DF = DISTRIBUTION_DF.replace({np.nan: None})
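
The change above switches the NaN-to-None conversion to `DataFrame.replace`. The diff does not state why the earlier `where` call stopped working, so the following is only a small sketch of what the new line is expected to do, assuming a pandas version in which a `{np.nan: None}` mapping stores `None` in the affected columns. The frame contents are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up data; the real test builds DISTRIBUTION_DF from its own DATA.
df = pd.DataFrame({"minimum": [1.0, np.nan], "name": ["a", "b"]})

# Map every NaN to None, as in the updated test.
df_clean = df.replace({np.nan: None})

values = df_clean["minimum"].tolist()
print(values)
```

Checking the result with `is None` rather than `pd.isna` is what distinguishes a real `None` from a NaN that survived the conversion.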


def test_probability_distributions() -> None: