diff --git a/README.md b/README.md index e744df2..84e579f 100644 --- a/README.md +++ b/README.md @@ -8,6 +8,7 @@ paperless-ngx-postprocessor allows you to automatically set titles, ASNs, and cr * Setup rulesets to choose which documents are postprocessed and which are ignored, based on metadata like correspondent, document_type, storage_path, tags, and more * For each ruleset, extract metadata using [Python regular expressions](https://docs.python.org/3/library/re.html#regular-expression-syntax) * Use [Jinja templates](https://jinja.palletsprojects.com/en/3.1.x/templates/) to specify new values for archive serial number, title, and created date, using the values from your regular expression +* Optionally use [Jinja templates](https://jinja.palletsprojects.com/en/3.1.x/templates/) to validate document metadata, and add a tag to documents that have invalid metadata (e.g. to catch parsing errors) * Optionally apply a tag to documents that are changed during postprocessing, so you can keep track of which documents have changed * Optionally make backups of changes, so you can restore document metadata back to the way it was before postprocessing * Optionally run on one or more existing documents, if you need to adjust the metadata of documents that have already been consumed by Paperless-ngx @@ -65,6 +66,8 @@ Last but not least, create rulesets in the `paperless-postprocessor-ngx/rulesets paperless-ngx-postprocessor works by reading rulesets from all the `.yml` files in the `rulesets.d` folder, seeing if the contents of the document match any of the rulesets, extracting values from the document's contents using a regular expression, and then writing new values for the metadata based on the document's preexisting metadata and any values extracted using the regular expression. +You can also provide an optional validation rule to catch documents whose metadata doesn't get set properly. + ### An example An example helps illustrate this. Say you have the following ruleset: @@ -75,6 +78,7 @@ Some Ruleset Name: metadata_postprocessing: source: '{{ source | title }}' # This applies the Jinja 'title' filter, capitalizing each word title: '{{created_year}}-{{created_month}}-{{created_day}} -- {{correspondent}} -- {{document_type}} (from {{ source }})' + validation_rule: '{{ created_date_object == last_date_object_of_month(created_date_object) }}' ``` First paperless-ngx-postprocessor will get a local copy of the document's preexisting metadata. For a full list of the preexisting metadata you can use for matching and postprocessing, see [below](#available-metadata). @@ -94,9 +98,11 @@ Finally after all the rules are processed, paperless-ngx-postprocessor will take If any of those differ from the values the document's metadata had when we started, then paperless-ngx-postprocessor will push the new values to paperless-ngx, and processing is complete. +After all of those values have been pushed, paperless-ngx-postprocessor will then try to evaluate the `validation_rule` field. In this case, the validation rule evaluates to `True` if the document's created date is the last day of the month. + ### Some caveats -In order to make parsing dates easier, paperless-postprocessor-ngx will "normalize" and error-check the `created_year`, `created_month`, and `created_day` fields after the initial values are extracted using the regular expression, and after every individual postprocessing rule. 
+In order to make parsing dates easier, paperless-ngx-postprocessor will "normalize" and error-check the `created_year`, `created_month`, and `created_day` fields after the initial values are extracted using the regular expression, and after every individual postprocessing rule.
Normalization is as follows:
* `created_day` will be turned into a zero-padded two-digit string (e.g. `09`).
@@ -117,6 +123,14 @@ In addition to the [default Jinja filters](https://jinja.palletsprojects.com/en/
  * Matches using `re.match()`. Only returns `True` or `False`. For details see the [official python documentation](https://docs.python.org/3/library/re.html#re.match).
* `regex_sub(pattern, repl)`
  * Substitutes using `re.sub()`. For details see the [official python documentation](https://docs.python.org/3/library/re.html#re.sub).
+* `date(year, month, day)`
+  * Creates a [Python `date` object](https://docs.python.org/3/library/datetime.html#date-objects) for the given date. This allows easier date manipulation inside Jinja templates.
+* `timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)`
+  * Creates a [Python `timedelta` object](https://docs.python.org/3/library/datetime.html#timedelta-objects). This allows easier date manipulation inside Jinja templates.
+* `last_date_object_of_month(date_object)`
+  * Takes a Python `date` object, extracts its month, and returns a new `date` object that corresponds to the last day of that month.
+* `num_documents(**constraints)`
+  * Queries Paperless-ngx to see how many documents satisfy all of the `constraints`. For more information see the `num_documents()` section below.
These can be used like this:
```
@@ -171,6 +185,86 @@ Each of the rules will match any and every document (since their `match` field i
6. Since fields persist across rulesets, and `bar` was set in the `First Ruleset`, title will be set to `uppercase foo is YOU_FOUND_ME`.
7. This title will then be used to finally update paperless-ngx.
+### The `num_documents()` filter
+
+The `num_documents()` filter is primarily intended for validation rules. It returns the number of documents that match *all* of the given constraints. Each of the constraints must be specified by keyword. Valid arguments are:
+* `correspondent` - The name of the correspondent
+* `document_type` - The name of the document type
+* `storage_path` - The name of the storage path
+* `asn` - The archive serial number
+* `title` - The title of the document
+* `added_year` - The added year (as an `int`)
+* `added_month` - The added month (as an `int`)
+* `added_day` - The added day (as an `int`)
+* `added_date_object` - The added date as a Python `date` object. This is essentially a shortcut for specifying all of `added_year`, `added_month`, and `added_day`.
+* `added_range` - Finds documents added within a given range. The value should be a tuple containing two `date` objects, e.g. `(start_date, end_date)`. If either date is `None`, then that side of the limit is ignored. The limits are exclusive, so `(date(2063, 4, 1), None)` will find documents added on or after April 2, 2063, and will not match any documents added on April 1.
+* `created_year` - The created year (as an `int`)
+* `created_month` - The created month (as an `int`)
+* `created_day` - The created day (as an `int`)
+* `created_date_object` - The created date as a Python `date` object. This is essentially a shortcut for specifying all of `created_year`, `created_month`, and `created_day`.
+* `created_range` - Finds documents created within a given range. The value should be a tuple containing two `date` objects, e.g. `(start_date, end_date)`. If either date is `None`, then that side of the limit is ignored. The limits are exclusive, so `(date(2063, 4, 1), None)` will find documents created on or after April 2, 2063, and will not match any documents created on April 1.
+
+Some examples will help explain how to use `num_documents()`.
+
+### Example validation rules
+
+Say you have documents whose creation dates should only be the end of the month (e.g. a bank statement). To catch documents whose creation date isn't the end of the month, you could use:
+```yaml
+validation_rule: "{{ created_date_object == last_date_object_of_month(created_date_object) }}"
+```
+
+Say you have documents that should only be created on Sundays. Then you could use [the Python `date` object's `weekday()` method](https://docs.python.org/3/library/datetime.html#datetime.date.weekday):
+```yaml
+validation_rule: "{{ created_date_object.weekday() == 6 }}"
+```
+
+Say you have documents that should be unique, i.e. there should be at most one document with a given correspondent, document type, storage path, etc. on a given day. You could use the `num_documents` custom Jinja filter:
+```yaml
+validation_rule: "{{ num_documents(correspondent=correspondent, document_type=document_type, storage_path=storage_path, created_date_object=created_date_object) == 1 }}"
+```
+(Note that you have to specify all of those selectors, since `num_documents()` looks at *all* documents, *not* just those that would otherwise match the current ruleset's `match` rule.)
+
+Or you can get even fancier: say you want at most one document from a particular correspondent in a given calendar week, starting on Sunday. Then we need an expression that will give us the Saturday before, since the range for `created_range` is exclusive. This little one-liner does just that, using the Python `timedelta` object:
+```yaml
+{% set week_start = created_date_object - timedelta(days=(((created_date_object.weekday()+1) % 7) + 1)) %}
+```
+
+And then the Sunday after is just 8 days later:
+```yaml
+{% set week_end = week_start + timedelta(days=8) %}
+```
+
+Putting it all together, we get a validation rule like:
+```yaml
+validation_rule: >-
+  {% set week_start = created_date_object - timedelta(days=(((created_date_object.weekday()+1) % 7) + 1)) %}
+  {% set week_end = week_start + timedelta(days=8) %}
+  {{ num_documents(correspondent=correspondent, created_range=(week_start, week_end)) == 1 }}
+```
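+
+As a complete worked example, a ruleset that applies this weekly-uniqueness check to a hypothetical correspondent (the ruleset and correspondent names are just placeholders) might look like:
+```yaml
+At most one per week from The Bank:
+  match: '{{ correspondent == "The Bank" }}'
+  validation_rule: >-
+    {% set week_start = created_date_object - timedelta(days=(((created_date_object.weekday()+1) % 7) + 1)) %}
+    {% set week_end = week_start + timedelta(days=8) %}
+    {{ num_documents(correspondent=correspondent, created_range=(week_start, week_end)) == 1 }}
+```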
+
+#### Exceptions
+
+Sometimes you'll want to exclude some documents from validation. To do so, you'll need to adjust the `match` rule to exclude them. In that case, it's recommended that you split the processing and the validation into separate rulesets. E.g. to ignore documents 123 and 456 when doing validation, this:
+```yaml
+Some rulename:
+  match: '{{ SOME_FILTER }}'
+  metadata_postprocessing:
+    some_var: '{{ SOME_POSTPROCESSING_RULE }}'
+  validation_rule: '{{ SOME_VALIDATION_RULE }}'
+```
+
+becomes this:
+```yaml
+Some rulename for postprocessing:
+  match: '{{ SOME_FILTER }}'
+  metadata_postprocessing:
+    some_var: '{{ SOME_POSTPROCESSING_RULE }}'
+---
+Some rulename for validation:
+  match: '{{ SOME_FILTER and document_id not in [123, 456] }}'
+  validation_rule: '{{ SOME_VALIDATION_RULE }}'
+```
+
+
## Formal ruleset definition

### Ruleset syntax
@@ -184,18 +278,21 @@ Ruleset Name:
    METADATA_FIELDNAME_1: METADATA_TEMPLATE_1
    ...
    METADATA_FIELDNAME_N: METADATA_TEMPLATE_N
+  validation_rule: VALIDATION_TEMPLATE
```
where
* `MATCH_TEMPLATE` is a Jinja template. If it evaluates to `True`, the ruleset will match and postprocessing will continue.
* `metadata_regex` is optional. If specified, `REGEX` is a Python regular expression. Any named groups in `REGEX` will be saved and their values can be used in the postprocessing rules in this ruleset.
* `metadata_postprocessing` is optional. If not specified, then paperless-ngx-postprocessor will update the document's metadata based only on the fields extracted from the regular expression.
* `METADATA_FIELDNAME_X` is the name of a metadata field to update, and `METADATA_TEMPLATE_X` is a Jinja template that will be evaluated using the metadata so far. You can have as many metadata fields as you like.
+* `validation_rule` is optional. If specified, paperless-ngx-postprocessor will evaluate the `VALIDATION_TEMPLATE` Jinja template. If it evaluates to `False` and the `INVALID_TAG` is set, then the `INVALID_TAG` will be added to the document. (If `validation_rule` is omitted, no validation check is done; see the example below.)
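+
+For example, here is a minimal concrete ruleset that follows this syntax (the ruleset name, match rule, regex, and templates are all purely illustrative):
+```yaml
+Parse statement dates:
+  match: '{{ correspondent == "The Bank" }}'
+  metadata_regex: '(?P<created_year>\d{4})-(?P<created_month>\d{2})-(?P<created_day>\d{2})'
+  metadata_postprocessing:
+    title: '{{ created_year }}-{{ created_month }}-{{ created_day }} -- {{ correspondent }}'
+  validation_rule: '{{ created_year | int >= 2000 }}'
+```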
### Available metadata:

The metadata available for matching and postprocessing mostly matches [the metadata available in paperless-ngx for filename handling](https://paperless-ngx.readthedocs.io/en/latest/advanced_usage.html#file-name-handling).

The following fields are read-only. They keep the same value through postprocessing as they had before postprocessing started. (If you try to overwrite them with new values, those values will be ignored.)
+* `document_id`: The document ID.
* `correspondent`: The name of the correspondent, or `None`.
* `document_type`: The name of the document type, or `None`.
* `tag_list`: A list object containing the names of all tags assigned to the document.
@@ -204,6 +301,8 @@
* `added_year`: Year added only (as a `str`, not an `int`).
* `added_month`: Month added only, number 01-12 (as a `str`, not an `int`).
* `added_day`: Day added only, number 01-31 (as a `str`, not an `int`).
+* `added_date`: The date the document was added in `YYYY-MM-DD` format.
+* `added_date_object`: A Python [date object](https://docs.python.org/3/library/datetime.html#date-objects) for the date the document was added.

The following fields are available for matching, and can be overwritten by values extracted from the regular expression (e.g. by using a named group with the field name) or by postprocessing rules.
* `asn`: The archive serial number of the document, or `None`.
@@ -215,6 +314,7 @@ The following fields are read-only, but will be updated automatically after every step by the values given in the `created_year`, `created_month`, and `created_day` fields.
* `created`: The full date (ISO format) the document was created.
* `created_date`: The date the document was created in `YYYY-MM-DD` format.
+* `created_date_object`: A Python [date object](https://docs.python.org/3/library/datetime.html#date-objects) for the date the document was created.

## Configuration

@@ -224,6 +324,7 @@ paperless-ngx-postprocessor can be configured using the following environment va
* `PNGX_POSTPROCESSOR_DRY_RUN=`: If set to `True`, paperless-ngx-postprocessor will not actually push any changes to paperless-ngx. (default: `False`)
-* `PNGX_POSTPROCESSOR_BACKUP=`: Backup file to write any changed values to. If no filename is given, one will be automatically generated based on the current date and time. If the path is a directory, the automatically generated file will be stored in that directory. (default: `False`)
+* `PNGX_POSTPROCESSOR_BACKUP=`: Backup file to write any changed values to. If the string `DEFAULT` is given, one will be automatically generated based on the current date and time. If the path is a directory, the automatically generated file will be stored in that directory. (default: `False`)
* `PNGX_POSTPROCESSOR_POSTPROCESSING_TAG=`: A tag to apply if any changes are made during postprocessing. (default: `None`)
+* `PNGX_POSTPROCESSOR_INVALID_TAG=`: A tag to apply if the document fails any validation rules (see the example below). (default: `None`)
* `PNGX_POSTPROCESSOR_RULESETS_DIR=`: The config directory (within the Docker container) containing the rulesets for postprocessing. (default: `/usr/src/paperless-ngx-postprocessor/rulesets.d`)
* `PNGX_POSTPROCESSOR_PAPERLESS_API_URL=`: The full URL to access the Paperless-ngx REST API (within the Docker container). (default: `http://localhost:8000/api`)
* `PNGX_POSTPROCESSOR_PAPERLESS_SRC_DIR=`: The directory containing the source for the running instance of paperless-ngx (within the Docker container). If this is set incorrectly, postprocessor will not be able to automagically acquire the auth token. (default: `/usr/src/paperless/src`)
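+
+For example, to set both tags, your `docker-compose.env` (if you use one) might include lines like these (a minimal sketch; the tag names are just examples, and the tags should already exist in paperless-ngx):
+```
+PNGX_POSTPROCESSOR_POSTPROCESSING_TAG=post-processed
+PNGX_POSTPROCESSOR_INVALID_TAG=invalid-metadata
+```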
@@ -271,6 +372,8 @@ Note that to run the management script from the docker host, you need to provide
./paperlessngx_postprocessor.py --auth-token THE_AUTH_TOKEN [specific command here]
```
+You'll probably also need to specify other configuration options (like the rulesets directory and the API URL), since paperless-ngx-postprocessor won't automatically read them from Paperless-ngx's `docker-compose.env` file.
+
### Running inside or outside the docker container

Note that no matter where you run it, `paperlessngx_postprocessor.py` will try to use sensible defaults to figure out how to access the Paperless-ngx API. If you have a custom configuration, you may need to specify additional configuration options to `paperlessngx_postprocessor.py`. See [Configuration](#configuration) above for more details.
@@ -279,10 +382,10 @@ In terms of how the script works in management mode, it runs post-processing on
For example to re-run postprocessing on all documents with `correspondent` `The Bank`, you would do the following (including the auth token if running this command from the Docker host):
```bash
-./paperlessngx_postprocessor.py [--auth-token THE_AUTH_TOKEN] correspondent "The Bank"
+./paperlessngx_postprocessor.py [--auth-token THE_AUTH_TOKEN] [OTHER OPTIONS] process --correspondent "The Bank"
```
-You can choose all documents of a particular `correspondent` or `document_type` or `storage_path`, all documents with a specific `tag`, or just all documents (using `all`), or a specific document using its `document_id`. Note that you cannot combine selectors on the command line: e.g it's not possible to select all documents that match both a given `document_type` and `tag` simultaneously on the command line.
+You can choose all documents of a particular `correspondent`, `document_type`, `storage_path`, `tag`, and many other selectors, by `document_id`, or even all documents. For details on how to specify documents, do `./paperlessngx_postprocessor.py process --help`. Note that as of version 2.0.0, you **can** combine selectors on the command line.
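+
+For example, to re-run postprocessing on all of The Bank's documents that also carry the (hypothetical) tag `statement` and were created in 2063, you could combine selectors like this:
+```bash
+./paperlessngx_postprocessor.py [--auth-token THE_AUTH_TOKEN] process --correspondent "The Bank" --tag "statement" --created-year 2063
+```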
The command line interface supports all of the same options that you can set via the environment variables listed in the [Configuration section above](#configuration). To see how to specify them, use the command line interface's built-in help:
```bash
@@ -313,10 +416,18 @@ To restore backup to undo changes, do:
If you want to see what the restore will do, you can open up the backup file in a text editor. Inside is just a yaml document with all of the document IDs and what their fields should be restored to.

-### Upgrading
+## Upgrading
+### Upgrading `paperless-ngx`
If you are running paperless-ngx in a Docker container, you will need to redo [setup step two](#2-run-the-one-time-setup-script-inside-the-paperless-ngx-docker-container) after any time you upgrade paperless-ngx.

+### Upgrading `paperless-ngx-postprocessor`
+In the directory where you checked out `paperless-ngx-postprocessor`, just do a `git pull`.
+
+#### Upgrading from v1 to v2
+- Rulesets for v2 are a superset of those for v1, so no changes should be necessary.
+- The command line interface has undergone breaking changes, so if you had any scripts that ran the management script (outside of running the standard post-consumption script), they'll need to be updated (see the example below).
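+
+For example, a v1 invocation that postprocessed a single document maps onto the v2 interface like this (the document ID is just illustrative):
+```bash
+# v1 (no longer works)
+./paperlessngx_postprocessor.py document_id 1234
+# v2
+./paperlessngx_postprocessor.py process --document-id 1234
+```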
+
## FAQ
### Will this work with paperless or paperless-ng?
diff --git a/paperlessngx_postprocessor.py b/paperlessngx_postprocessor.py
index be62eff..232bba2 100755
--- a/paperlessngx_postprocessor.py
+++ b/paperlessngx_postprocessor.py
@@ -4,88 +4,130 @@ import logging
import sys
import yaml
+import os
from paperlessngx_postprocessor import Config, PaperlessAPI, Postprocessor
if __name__ == "__main__":
    logging.basicConfig(format="[%(asctime)s] [%(levelname)s] [%(module)s] %(message)s")#, level=logging.DEBUG)
-    config = Config()
+    config = Config(Config.general_options())
    arg_parser = argparse.ArgumentParser(description="Apply postprocessing to documents in Paperless-ngx",
                                         #formatter_class=argparse.ArgumentDefaultsHelpFormatter,
                                         epilog="See https://github.com/jgillula/paperless-ngx-postprocessor#readme for more information and detailed examples.")
    for option_name in config.options_spec.keys():
        arg_parser.add_argument("--" + option_name.replace("_","-"), **config.options_spec[option_name].argparse_args)
-
-    selector_options = ["document_id", "correspondent", "document_type", "tag", "storage_path", "all", "restore"]
-    arg_parser.add_argument("selector", metavar="SELECTOR", type=str, choices=selector_options, help="Selector to specify which document(s) to postprocess (or that you want to restore from a backup file). Choose one of {{{}}}".format(", ".join(selector_options)))
-    arg_parser.add_argument("item_id_or_name", nargs='?', type=str, help="document_id or name of the correspondent/document_type/tag/storage_path of the documents to postprocess, or filename of the backup file to restore. Required for all selectors except 'all'.")
+
+    # arg_parser.add_argument("--select", metavar=("ADDITIONAL_SELECTOR", "ITEM_NAME"), nargs=2, action="append", help="Additional optional selectors to apply to narrow the set of documents to apply postprocessing to. Ignored if SELECTOR is one of {all, document_id, restore}. ADDITIONAL_SELECTOR must be one of {correspondent, document_type, tag, storage_path}.")
+
+    subparsers = arg_parser.add_subparsers(dest="mode", title='Modes', help="Use 'process [ARGS]' to choose which documents to process, or 'restore FILENAME' to restore a backup file.")
+
+    process_subparser = subparsers.add_parser("process", usage=f"{os.path.basename(__file__)} [OPTIONS] process [SELECTORS]", description='Process documents where all the [SELECTORS] match (i.e. the selectors are combined with a logical AND). At least one selector is required. If --all or --document-id is given, all the other selectors are ignored.')
+    selector_group = process_subparser.add_argument_group(title="SELECTORS")
+    selector_config = Config(Config.selector_options(), use_environment_variables=False)
+    for option_name in selector_config.options_spec.keys():
+        selector_group.add_argument("--" + option_name.replace("_","-"), **selector_config.options_spec[option_name].argparse_args)
+
+    restore_subparser = subparsers.add_parser("restore", usage=f"{os.path.basename(__file__)} [OPTIONS] restore FILENAME")
+    restore_subparser.add_argument("filename", metavar="FILENAME", type=str, help="Filename of the backup file to restore.")
    cli_options = vars(arg_parser.parse_args())
    config.update_options(cli_options)
+    selector_config.update_options(cli_options)
-    config["selector"] = cli_options["selector"]
-    config["item_id_or_name"] = cli_options["item_id_or_name"]
+    # config["selector"] = cli_options["selector"]
+    # config["item_id_or_name"] = cli_options["item_id_or_name"]
-    logging.getLogger().setLevel(config["verbose"])
-    logging.debug(f"Running {sys.argv[0]} with config {config}")
-
-    if config["selector"] != "all" and config["item_id_or_name"] is None:
-        if config["selector"] == "restore":
-            logging.error(f"A filename is required to backup from.")
-        else:
-            logging.error(f"An item ID or name is required when postprocessing documents by {config['selector']}, but none was provided.")
+    config["mode"] = cli_options["mode"]
+    config["filename"] = cli_options.get("filename")
-    if config["selector"] == "restore" and config["backup"] is not None:
-        logging.critical("Can't restore and do a backup simultaneously. Please choose one or the other.")
+    logger = logging.getLogger("paperlessngx_postprocessor")
+    logger.setLevel(config["verbose"])
+    logger.debug(f"Running {sys.argv[0]} with config {config} and {selector_config}")
+
+    # if config["selector"] != "all" and config["item_id_or_name"] is None:
+    #     if config["selector"] == "restore":
+    #         logging.error(f"A filename is required to backup from.")
+    #     else:
+    #         logging.error(f"An item ID or name is required when postprocessing documents by {config['selector']}, but none was provided.")
+
+    if config["mode"] == "restore" and config["backup"] is not None:
+        logger.critical("Can't restore and do a backup simultaneously. Please choose one or the other.")
        sys.exit(1)
    if config["dry_run"]:
-        logging.info("Doing a dry run. No changes will be made.")
+        # Force at least INFO level by choosing whichever level is lower, the given level or INFO (lower means more verbose)
+        logger.setLevel(min(logging.getLevelName(config["verbose"]), logging.getLevelName("INFO")))
+        logger.info("Doing a dry run. 
No changes will be made.") api = PaperlessAPI(config["paperless_api_url"], auth_token = config["auth_token"], paperless_src_dir = config["paperless_src_dir"], - logger=logging.getLogger()) + logger=logger) postprocessor = Postprocessor(api, config["rulesets_dir"], postprocessing_tag = config["postprocessing_tag"], + invalid_tag = config["invalid_tag"], dry_run = config["dry_run"], - logger=logging.getLogger()) + skip_validation = config["skip_validation"], + logger=logger) documents = [] - if config["selector"] == "restore": - logging.info(f"Restoring backup from {config['item_id_or_name']}") - with open(config["item_id_or_name"], "r") as backup_file: + if config["mode"] == "restore": + logger.info(f"Restoring backup from {config['filename']}") + with open(config["filename"], "r") as backup_file: yaml_documents = list(yaml.safe_load_all(backup_file)) - logging.info(f" Restoring {len(yaml_documents)} documents") + logger.info(f" Restoring {len(yaml_documents)} documents") for yaml_document in yaml_documents: document_id = yaml_document['id'] yaml_document.pop("id") current_document = api.get_document_by_id(document_id) - logging.info(f"Restoring document {document_id}") + logger.info(f"Restoring document {document_id}") for key in yaml_document: - logging.info(f" {key}: '{current_document.get(key)}' --> '{yaml_document[key]}'") + logger.info(f" {key}: '{current_document.get(key)}' --> '{yaml_document[key]}'") if not config["dry_run"]: api.patch_document(document_id, yaml_document) sys.exit(0) - elif config["selector"] == "all": - documents = api.get_all_documents() - logging.info(f"Postprocessing all {len(documents)} documents") - elif config["selector"] == "document_id": - documents.append(api.get_document_by_id(config["item_id_or_name"])) - elif config["selector"] in ["correspondent", "document_type", "tag", "storage_path"]: - documents = api.get_documents_by_selector_name(config["selector"], config["item_id_or_name"]) + elif config["mode"] == "process": + if selector_config["all"]: + documents = api.get_all_documents() + logger.info(f"Postprocessing all {len(documents)} documents") + elif not(any(selector_config.values())): + logger.error("No SELECTORS provided. 
Please specify at least one SELECTOR.") + sys.exit(1) + elif selector_config.get("document_id"): + documents.append(api.get_document_by_id(selector_config.get("document_id"))) + else: + documents = api.get_documents_by_field_names(**selector_config.options()) + + # Filter out any null documents, and then warn if no documents are left + documents = list(filter(lambda doc: doc, documents)) if len(documents) == 0: - logging.warning(f"No documents found with {config['selector']} \'{config['item_id_or_name']}\'") + logger.warning(f"No documents found") + sys.exit(0) else: - logging.info(f"Postprocessing {len(documents)} documents with {config['selector']} \'{config['item_id_or_name']}\'") - - backup_documents = postprocessor.postprocess(documents) + logger.info(f"Processing {len(documents)} documents.") + # documents.append(api.get_ + + # elif config["selector"] == "all": + # documents = api.get_all_documents() + # logger.info(f"Postprocessing all {len(documents)} documents") + # elif config["selector"] == "document_id": + # documents.append(api.get_document_by_id(config["item_id_or_name"])) + # elif config["selector"] in ["correspondent", "document_type", "tag", "storage_path"]: + # fields = {config["selector"]: config["item_id_or_name"]} + # documents = api.get_documents_by_field_names() + # # documents = api.get_documents_by_selector_name(config["selector"], config["item_id_or_name"]) + # # if len(documents) == 0: + # # logger.warning(f"No documents found with {config['selector']} \'{config['item_id_or_name']}\'") + # # else: + # # logger.info(f"Postprocessing {len(documents)} documents with {config['selector']} \'{config['item_id_or_name']}\'") + + backup_documents = postprocessor.postprocess(documents) - if len(backup_documents) > 0 and config["backup"] is not None: - logging.debug(f"Writing backup to {config['backup']}") - with open(config["backup"], "w") as backup_file: - backup_file.write(yaml.dump_all(backup_documents)) + if len(backup_documents) > 0 and config["backup"] is not None: + logger.debug(f"Writing backup to {config['backup']}") + with open(config["backup"], "w") as backup_file: + backup_file.write(yaml.dump_all(backup_documents)) diff --git a/paperlessngx_postprocessor/config.py b/paperlessngx_postprocessor/config.py index 36173f0..5db3ca2 100644 --- a/paperlessngx_postprocessor/config.py +++ b/paperlessngx_postprocessor/config.py @@ -1,5 +1,6 @@ import os -from datetime import datetime +import dateutil.parser +from datetime import datetime, date from pathlib import Path class Config: @@ -10,62 +11,180 @@ def __init__(self, default, argparse_args): if "help" in self.argparse_args: self.argparse_args["help"] = self.argparse_args["help"].format(default = default) - + + _default_backup_name = datetime.now().strftime("%Y-%m-%d--%H-%M-%S")+".backup" + + def selector_options(): + return {"document_id": Config.OptionSpec(None, {"metavar": "DOCUMENT_ID", + "help": "Select a document by its DOCUMENT_ID"}), + "correspondent": Config.OptionSpec(None, {"metavar": "CORRESPONDENT_NAME", + "type": str, + "help": "Select documents by their CORRESPONDENT_NAME"}), + "document_type": Config.OptionSpec(None, {"metavar": "DOCUMENT_TYPE_NAME", + "type": str, + "help": "Select documents by their DOCUMENT_TYPE_NAME"}), + "tag": Config.OptionSpec(None, {"metavar": "TAG_NAME", + "type": str, + "help": "Select documents with tag TAG_NAME"}), + "storage_path": Config.OptionSpec(None, {"metavar": "STORAGE_PATH_NAME", + "type": str, + "help": "Select documents by their STORAGE_PATH_NAME"}), + 
"created_year": Config.OptionSpec(None, {"metavar": "YEAR", + "type": int, + "help": "Select documents created in YEAR."}), + "created_month": Config.OptionSpec(None, {"metavar": "MONTH", + "type": int, + "help": "Select documents created in MONTH."}), + "created_day": Config.OptionSpec(None, {"metavar": "DAY", + "type": int, + "help": "Select documents created in DAY."}), + "created_range": Config.OptionSpec(None, {"metavar": "DATE--DATE", + "type": str, + "help": "Select documents created in a given range (exclusive), where DATE is of the form YYYY-MM-DD. Example: To get all documents created in April of 2063, you would use '--created-range 2063-03-31--2063-05-01'. To only get documents created before or after a given date, use 'x' instead of date, e.g. 'x--2063-05-01'"}), + "created_year": Config.OptionSpec(None, {"metavar": "YEAR", + "type": int, + "help": "Select documents created in YEAR."}), + "added_month": Config.OptionSpec(None, {"metavar": "MONTH", + "type": int, + "help": "Select documents added in MONTH."}), + "added_day": Config.OptionSpec(None, {"metavar": "DAY", + "type": int, + "help": "Select documents added in DAY."}), + "added_range": Config.OptionSpec(None, {"metavar": "DATE--DATE", + "type": str, + "help": "Select documents added in a given range (exclusive), where DATE is of the form YYYY-MM-DD. Example: To get all documents added in April of 2063, you would use '--added-range 2063-03-31--2063-05-01'. To only get documents added before or after a given date, use 'x' instead of date, e.g. 'x--2063-05-01'"}), + "asn": Config.OptionSpec(None, {"metavar": "ASN", + "type": int, + "help": "Select document by its ASN"}), + "title": Config.OptionSpec(None, {"metavar": "TITLE", + "type": str, + "help": "Select document by its TITLE"}), + "all": Config.OptionSpec(False, {"action": "store_true", + "help": "Select all documents. WARNING! If you have a lot of documents, this will take a long time."}), + } - def __init__(self): - self._default_backup_name = datetime.now().strftime("%Y-%m-%d--%H-%M-%S")+".backup" - - self.options_spec = {"auth_token": Config.OptionSpec(None, {"metavar": "AUTH_TOKEN", - "type": str, - "help": "The auth token to access the REST API of Paperless-ngx. If not specified, postprocessor will try to automagically get it from Paperless-ngx's database directly."}), - "dry_run": Config.OptionSpec(False, {"action": "store_const", - "const": True, - "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), - #"dry_run": Config.OptionSpec(False, {"action": "store_true", - # "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), - "backup": Config.OptionSpec(None, {"nargs": '?', - "type": str, - "const": self._default_backup_name, - "help": "Backup file to write any changed values to. If no filename is given, one will be automatically generated based on the current date and time. If the path is a directory, the automatically generated file will be stored in that directory. (default: {default})"}), - "postprocessing_tag": Config.OptionSpec(None, {"metavar": "TAG", - "type": str, - "help": "A tag to apply if any changes are made during postprocessing. (default: {default})"}), - "verbose": Config.OptionSpec("WARNING", {"type": str, - "choices": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], - "help": "The verbosity level for logging. 
(default: {default})"}), - "rulesets_dir": Config.OptionSpec("/usr/src/paperless-ngx-postprocessor/rulesets.d", {"metavar": "RULESETS_DIR", - "type": str, - "help": "The config directory containing the rulesets for postprocessing. (default: {default})"}), - "paperless_api_url": Config.OptionSpec("http://localhost:8000/api", {"metavar": "API_URL", - "type": str, - "help": "The full URL to access the Paperless-ngx REST API. (default: {default})"}), - "paperless_src_dir": Config.OptionSpec("/usr/src/paperless/src", {"metavar": "PAPERLESS_SRC_DIR", - "type": str, - "help": "The directory containing the source for the running instance of paperless. If this is set incorrectly, postprocessor will not be able to automagically acquire the AUTH_TOKEN. (default: {default})"}), + def general_options(): + return {"auth_token": Config.OptionSpec(None, {"metavar": "AUTH_TOKEN", + "type": str, + "help": "The auth token to access the REST API of Paperless-ngx. If not specified, postprocessor will try to automagically get it from Paperless-ngx's database directly."}), + "dry_run": Config.OptionSpec(False, {"action": "store_const", + "const": True, + "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), + "skip_validation": Config.OptionSpec(False, {"action": "store_const", + "const": True, + "help": "Don't process any validation rules. (default: {default})"}), + #"dry_run": Config.OptionSpec(False, {"action": "store_true", + # "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), + "backup": Config.OptionSpec(None, {"type": str, + "metavar": "FILENAME", + "help": "Backup file to write any changed values to. If the string DEFAULT is given, one will be automatically generated based on the current date and time. If the path is a directory, the automatically generated file will be stored in that directory. (default: YYYY-MM-DD--HH-MM-SS.backup)"}), + "postprocessing_tag": Config.OptionSpec(None, {"metavar": "TAG", + "type": str, + "help": "A tag to apply if any changes are made during postprocessing. (default: {default})"}), + "invalid_tag": Config.OptionSpec(None, {"metavar": "TAG", + "type": str, + "help": "A tag to apply if the resulting metadata doesn't satisfy any validation rules. (default: {default})"}), + "verbose": Config.OptionSpec("WARNING", {"type": str, + "choices": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + "help": "The verbosity level for logging. (default: {default})"}), + "rulesets_dir": Config.OptionSpec("/usr/src/paperless-ngx-postprocessor/rulesets.d", {"metavar": "RULESETS_DIR", + "type": str, + "help": "The config directory containing the rulesets for postprocessing. (default: {default})"}), + "paperless_api_url": Config.OptionSpec("http://localhost:8000/api", {"metavar": "API_URL", + "type": str, + "help": "The full URL to access the Paperless-ngx REST API. (default: {default})"}), + "paperless_src_dir": Config.OptionSpec("/usr/src/paperless/src", {"metavar": "PAPERLESS_SRC_DIR", + "type": str, + "help": "The directory containing the source for the running instance of paperless. If this is set incorrectly, postprocessor will not be able to automagically acquire the AUTH_TOKEN. 
(default: {default})"}), } + def __init__(self, options_spec, use_environment_variables = True): + #self._default_backup_name = datetime.now().strftime("%Y-%m-%d--%H-%M-%S")+".backup" + + # self.options_spec = {"auth_token": Config.OptionSpec(None, {"metavar": "AUTH_TOKEN", + # "type": str, + # "help": "The auth token to access the REST API of Paperless-ngx. If not specified, postprocessor will try to automagically get it from Paperless-ngx's database directly."}), + # "dry_run": Config.OptionSpec(False, {"action": "store_const", + # "const": True, + # "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), + # "skip_validation": Config.OptionSpec(False, {"action": "store_const", + # "const": True, + # "help": "Don't process any validation rules. (default: {default})"}), + # #"dry_run": Config.OptionSpec(False, {"action": "store_true", + # # "help": "Don't actually make any changes, just print what would happen. Forces the verbosity level to be at least INFO. (default: {default})"}), + # "backup": Config.OptionSpec(None, {"nargs": '?', + # "type": str, + # "const": self._default_backup_name, + # "help": "Backup file to write any changed values to. If no filename is given, one will be automatically generated based on the current date and time. If the path is a directory, the automatically generated file will be stored in that directory. (default: {default})"}), + # "postprocessing_tag": Config.OptionSpec(None, {"metavar": "TAG", + # "type": str, + # "help": "A tag to apply if any changes are made during postprocessing. (default: {default})"}), + # "invalid_tag": Config.OptionSpec(None, {"metavar": "TAG", + # "type": str, + # "help": "A tag to apply if the resulting metadata doesn't satisfy any validation rules. (default: {default})"}), + # "verbose": Config.OptionSpec("WARNING", {"type": str, + # "choices": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + # "help": "The verbosity level for logging. (default: {default})"}), + # "rulesets_dir": Config.OptionSpec("/usr/src/paperless-ngx-postprocessor/rulesets.d", {"metavar": "RULESETS_DIR", + # "type": str, + # "help": "The config directory containing the rulesets for postprocessing. (default: {default})"}), + # "paperless_api_url": Config.OptionSpec("http://localhost:8000/api", {"metavar": "API_URL", + # "type": str, + # "help": "The full URL to access the Paperless-ngx REST API. (default: {default})"}), + # "paperless_src_dir": Config.OptionSpec("/usr/src/paperless/src", {"metavar": "PAPERLESS_SRC_DIR", + # "type": str, + # "help": "The directory containing the source for the running instance of paperless. If this is set incorrectly, postprocessor will not be able to automagically acquire the AUTH_TOKEN. 
(default: {default})"}), + # } + + self.options_spec = options_spec + self._options = {} for option_name in self.options_spec.keys(): self._options[option_name] = self.options_spec[option_name].default - if os.environ.get("PNGX_POSTPROCESSOR_"+option_name.upper()) is not None: + if os.environ.get("PNGX_POSTPROCESSOR_"+option_name.upper()) is not None and use_environment_variables: self._options[option_name] = os.environ.get("PNGX_POSTPROCESSOR_"+option_name.upper()) self._fix_options() def _fix_options(self): - if isinstance(self._options["dry_run"], str): + if isinstance(self._options.get("dry_run"), str): if self._options["dry_run"].lower() in ["f", "false", "no"]: self._options["dry_run"] = False elif self._options["dry_run"].lower() in ["t", "true", "yes"]: self._options["dry_run"] = True - if isinstance(self._options["backup"], str): - if self._options["backup"].lower() in ["t", "true", "yes"]: - self._options["backup"] = self._default_backup_name - elif self._options["backup"].lower() in ["f", "false", "no"]: - self._options["backup"] = None + if isinstance(self._options.get("backup"), str): + if self._options["backup"].lower() == "default": + self._options["backup"] = Config._default_backup_name else: backup_path = Path(self._options["backup"]) if backup_path.is_dir(): - self._options["backup"] = str(backup_path / Path(self._default_backup_name)) + self._options["backup"] = str(backup_path / Path(Config._default_backup_name)) + if isinstance(self._options.get("created_range"), str): + dates = self._options.get("created_range").split("--") + if len(dates) == 2: + new_dates = [] + for date_str in dates: + try: + datetime_obj = dateutil.parser.isoparse(date_str) + new_dates.append(datetime_obj.date()) + except: + new_dates.append(None) + self._options["created_range"] = new_dates + else: + self._options["created_range"] = None + + if isinstance(self._options.get("added_range"), str): + dates = self._options.get("added_range").split("--") + if len(dates) == 2: + new_dates = [] + for date_str in dates: + try: + datetime_obj = dateutil.parser.isoparse(date_str) + new_dates.append(datetime_obj.date()) + except: + new_dates.append(None) + self._options["added_range"] = new_dates + else: + self._options["added_range"] = None def __getitem__(self, index): return self._options[index] @@ -75,7 +194,16 @@ def __setitem__(self, index, item): def __str__(self): return str(self._options) - + + def get(self, index, default=None): + return self._options.get(index, default) + + def values(self): + return self._options.values() + + def options(self): + return self._options + def update_options(self, new_options): for option_name in self.options_spec.keys(): if option_name in new_options and new_options[option_name] is not None: diff --git a/paperlessngx_postprocessor/paperless_api.py b/paperlessngx_postprocessor/paperless_api.py index ff0a149..b09243d 100644 --- a/paperlessngx_postprocessor/paperless_api.py +++ b/paperlessngx_postprocessor/paperless_api.py @@ -2,6 +2,7 @@ import logging import os import requests +from datetime import date from pathlib import Path class PaperlessAPI: @@ -22,6 +23,8 @@ def __init__(self, api_url, auth_token, paperless_src_dir, logger=None): logging.debug(f"Auth token {auth_token} acquired") self._auth_token = auth_token + self._cache = {} + self._cachable_types = ["correspondents", "document_types", "storage_paths", "tags"] def delete_document_by_id(self, document_id): item_type = "documents" @@ -47,6 +50,11 @@ def _get_item_by_id(self, item_type, item_id): return {} def 
_get_list(self, item_type, query=None):
+        # If the given item type has been cached, return it
+        if item_type in self._cache and query is None:
+            self._logger.debug(f"Returning {item_type} list from cache")
+            return self._cache[item_type]
+
        items = []
        next_url = f"{self._api_url}/{item_type}/"
        if query is not None:
@@ -61,6 +69,9 @@
            else:
                next_url = None
+        if item_type in self._cachable_types:
+            self._cache[item_type] = items
+
        return items
    def get_item_id_by_name(self, item_type, item_name):
@@ -84,6 +95,54 @@ def get_documents_by_selector_name(self, selector, name):
            query = f"{selector}s__id={selector_id}"
        return self._get_list("documents", query)
+    def get_documents_by_field_names(self, **fields):
+        allowed_fields = {"correspondent": "correspondent__name__iexact",
+                          "document_type": "document_type__name__iexact",
+                          "storage_path": "storage_path__name__iexact",
+                          "added_year": "added__year",
+                          "added_month": "added__month",
+                          "added_day": "added__day",
+                          "asn": "archive_serial_number",
+                          "title": "title__iexact",
+                          "created_year": "created__year",
+                          "created_month": "created__month",
+                          "created_day": "created__day",
+                          }
+
+        queries = []
+        for key in allowed_fields.keys():
+            if key in fields.keys() and fields[key] is not None:
+                queries.append(f"{allowed_fields[key]}={fields[key]}")
+
+        if (isinstance(fields.get("added_range"), (tuple, list)) and
+            len(fields.get("added_range")) == 2):
+            if isinstance(fields["added_range"][0], date):
+                queries.append(f"added__date__gt={fields['added_range'][0].strftime('%F')}")
+            if isinstance(fields["added_range"][1], date):
+                queries.append(f"added__date__lt={fields['added_range'][1].strftime('%F')}")
+
+        if (isinstance(fields.get("created_range"), (tuple, list)) and
+            len(fields.get("created_range")) == 2):
+            if isinstance(fields["created_range"][0], date):
+                queries.append(f"created__date__gt={fields['created_range'][0].strftime('%F')}")
+            if isinstance(fields["created_range"][1], date):
+                queries.append(f"created__date__lt={fields['created_range'][1].strftime('%F')}")
+
+
+        if isinstance(fields.get("added_date_object"), date):
+            queries.append(f"added__year={fields['added_date_object'].year}&added__month={fields['added_date_object'].month}&added__day={fields['added_date_object'].day}")
+
+        if isinstance(fields.get("created_date_object"), date):
+            queries.append(f"created__year={fields['created_date_object'].year}&created__month={fields['created_date_object'].month}&created__day={fields['created_date_object'].day}")
+
+        query = "&".join(queries)
+        self._logger.debug(f"Running query '{query}'")
+        return self._get_list("documents", query)
+
+
+    # def get_documents_from_query(self, query):
+    #     return self._get_list("documents", query)
+
    def get_all_documents(self):
        return self._get_list("documents")
@@ -104,6 +163,7 @@ def get_tag_by_id(self, tag_id):
    def get_metadata_in_filename_format(self, metadata):
        new_metadata = {}
+        new_metadata["document_id"] = metadata["id"]
        new_metadata["correspondent"] = (self.get_correspondent_by_id(metadata["correspondent"])).get("name")
        new_metadata["document_type"] = (self.get_document_type_by_id(metadata["document_type"])).get("name")
        new_metadata["storage_path"] = (self.get_storage_path_by_id(metadata["storage_path"])).get("name")
@@ -115,16 +175,21 @@
        new_metadata["created_year"] = f"{created_date.year:04d}"
        new_metadata["created_month"] = f"{created_date.month:02d}"
        new_metadata["created_day"] = f"{created_date.day:02d}"
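+        # Alongside the zero-padded string fields, also expose a YYYY-MM-DD string and a real Python date object, so Jinja templates can do date arithmetic without string parsing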
new_metadata["created_date"] = created_date.strftime("%F") # %F means YYYY-MM-DD + new_metadata["created_date_object"] = created_date new_metadata["added"] = metadata["added"] added_date = dateutil.parser.isoparse(new_metadata["added"]) new_metadata["added_year"] = f"{added_date.year:04d}" new_metadata["added_month"] = f"{added_date.month:02d}" new_metadata["added_day"] = f"{added_date.day:02d}" + new_metadata["added_date"] = added_date.strftime("%F") + new_metadata["added_date_object"] = added_date return new_metadata def get_metadata_from_filename_format(self, metadata_in_filename_format): result = {} + result["id"] = metadata_in_filename_format["document_id"] result["correspondent"] = self.get_item_id_by_name("correspondents", metadata_in_filename_format["correspondent"]) result["document_type"] = self.get_item_id_by_name("document_types", metadata_in_filename_format["document_type"]) result["storage_path"] = self.get_item_id_by_name("storage_paths", metadata_in_filename_format["storage_path"]) diff --git a/paperlessngx_postprocessor/postprocessor.py b/paperlessngx_postprocessor/postprocessor.py index af610cf..735e5d9 100644 --- a/paperlessngx_postprocessor/postprocessor.py +++ b/paperlessngx_postprocessor/postprocessor.py @@ -4,28 +4,35 @@ import logging import regex import yaml -from datetime import datetime +from datetime import date, datetime, timedelta from pathlib import Path from .paperless_api import PaperlessAPI class DocumentRuleProcessor: - def __init__(self, spec, logger = None): + def __init__(self, api, spec, logger = None): self._logger = logger if self._logger is None: logging.basicConfig(format="[%(asctime)s] [%(levelname)s] [%(module)s] %(message)s", level="CRITICAL") self._logger = logging.getLogger() + self._api = api + self.name = list(spec.keys())[0] self._match = spec[self.name].get("match") self._metadata_regex = spec[self.name].get("metadata_regex") self._metadata_postprocessing = spec[self.name].get("metadata_postprocessing") + self._validation_rule = spec[self.name].get("validation_rule") #self._title_format = spec[self.name].get("title_format") self._env = jinja2.Environment() self._env.filters["expand_two_digit_year"] = self._expand_two_digit_year self._env.filters["regex_match"] = self._jinja_filter_regex_match self._env.filters["regex_sub"] = self._jinja_filter_regex_sub + self._env.globals["last_date_object_of_month"] = self._last_date_object_of_month + self._env.globals["num_documents"] = self._num_documents + self._env.globals["date"] = date + self._env.globals["timedelta"] = timedelta def matches(self, metadata): if type(self._match) is str: @@ -65,6 +72,60 @@ def _expand_two_digit_year(self, year, prefix=None): else: return f"{year}" + def _last_date_object_of_month(self, date_object): + if isinstance(date_object, date): + return date(date_object.year, date_object.month, calendar.monthrange(date_object.year, date_object.month)[1]) + return None + + def _num_documents(self, **constraints): + # allowed_constraints = {"correspondent": "correspondent__name__iexact", + # "document_type": "document_type__name__iexact", + # "storage_path": "storage_path__name__iexact", + # "added_year": "added__year", + # "added_month": "added__month", + # "added_day": "added_day", + # "asn": "archive_serial_number", + # "title": "title__iexact", + # "created_year": "created__year", + # "created_month": "created__month", + # "created_day": "created__day", + # } + + # queries = [] + # for key in allowed_constraints.keys(): + # if key in constraints.keys(): + # 
queries.append(f"{allowed_constraints[key]}={constraints[key]}") + + # if (isinstance(constraints.get("added_range"), (tuple, list)) and + # len(constraints.get("added_range")) == 2): + # if isinstance(constraints["added_range"][0], date): + # queries.append(f"added__date__gt={constraints['added_range'][0].strftime('%F')}") + # if isinstance(constraints["added_range"][1], date): + # queries.append(f"added__date__lt={constraints['added_range'][1].strftime('%F')}") + + # if (isinstance(constraints.get("created_range"), (tuple, list)) and + # len(constraints.get("created_range")) == 2): + # if isinstance(constraints["created_range"][0], date): + # queries.append(f"created__date__gt={constraints['created_range'][0].strftime('%F')}") + # if isinstance(constraints["created_range"][1], date): + # queries.append(f"created__date__lt={constraints['created_range'][1].strftime('%F')}") + + + # if isinstance(constraints.get("added_date_object"), date): + # queries.append(f"added__year={constraints['added_date_object'].year}&added__month={constraints['added_date_object'].month}&added__day={constraints['added_date_object'].day}") + + # if isinstance(constraints.get("created_date_object"), date): + # queries.append(f"created__year={constraints['created_date_object'].year}&created__month={constraints['created_date_object'].month}&created__day={constraints['created_date_object'].day}") + + # query = "&".join(queries) + # self._logger.debug(f"Running query '{query}'") + + #items = self._api.get_documents_from_query(query) + items = self._api.get_documents_by_field_names(**constraints) + self._logger.debug(f"Found {len(items)} documents matching the query") + + return len(items) + def _jinja_filter_regex_match(self, string, pattern): '''Custom jinja filter for regex matching''' if regex.match(pattern, string): @@ -78,23 +139,39 @@ def _jinja_filter_regex_sub(self, string, pattern, repl): def _normalize_created_dates(self, new_metadata, old_metadata): result = new_metadata.copy() - #if "created_year" in metadata.keys(): try: result["created_year"] = str(int(new_metadata["created_year"])) except: result["created_year"] = old_metadata["created_year"] - #if "created_month" in metadata.keys(): result["created_month"] = self._normalize_month(new_metadata["created_month"], old_metadata["created_month"]) - #if "created_day" in metadata.keys(): result["created_day"] = self._normalize_day(new_metadata["created_day"], old_metadata["created_day"]) original_created_date = dateutil.parser.isoparse(old_metadata["created"]) - new_created_date = datetime(int(result["created_year"]), int(result["created_month"]), int(result["created_day"]), 12, tzinfo=original_created_date.tzinfo) + new_created_date = datetime(int(result["created_year"]), int(result["created_month"]), int(result["created_day"]), original_created_date.hour, tzinfo=original_created_date.tzinfo) result["created"] = new_created_date.isoformat() result["created_date"] = new_created_date.strftime("%F") # %F means YYYY-MM-DD - + result["created_date_object"] = date(int(result["created_year"]), int(result["created_month"]), int(result["created_day"])) + return result + def validate(self, metadata): + valid = True + + metadata = self._normalize_created_dates(metadata, metadata) + + # Try to apply the validation rule + if self._validation_rule is not None: + self._logger.debug(f"Validating for rule {self.name} using metadata={metadata}") + template = self._env.from_string(self._validation_rule) + template_result = template.render(**metadata).strip() + 
self._logger.debug(f"Validation template rendered to '{template_result}'") + valid = (template_result != "False") + if not valid: + self._logger.warning(f"Failed validation rule '{self._validation_rule}'") + else: + self._logger.debug(f"No validation rule found for {self.name}") + + return valid def get_new_metadata(self, metadata, content): read_only_metadata_keys = ["correspondent", @@ -104,7 +181,8 @@ def get_new_metadata(self, metadata, content): "added", "added_year", "added_month", - "added_day"] + "added_day", + "document_id"] read_only_metadata = {key: metadata[key] for key in read_only_metadata_keys if key in metadata} writable_metadata_keys = list(set(metadata.keys()) - set(read_only_metadata_keys)) writable_metadata = {key: metadata[key] for key in writable_metadata_keys if key in metadata} @@ -119,8 +197,8 @@ def get_new_metadata(self, metadata, content): writable_metadata = self._normalize_created_dates(writable_metadata, metadata) self._logger.debug(f"Regex results are {writable_metadata}") else: - self._logger.warning(f"Regex '{self._metadata_regex}' for '{self.name}' didn't match") - + self._logger.warning(f"Regex '{self._metadata_regex}' for '{self.name}' didn't match for document_id={metadata['document_id']}") + # Cycle throguh the postprocessing rules if self._metadata_postprocessing is not None: for variable_name in self._metadata_postprocessing.keys(): @@ -128,7 +206,7 @@ def get_new_metadata(self, metadata, content): old_value = writable_metadata.get(variable_name) merged_metadata = {**writable_metadata, **read_only_metadata} template = self._env.from_string(self._metadata_postprocessing[variable_name]) - writable_metadata[variable_name] = template.render(**merged_metadata) + writable_metadata[variable_name] = template.render(**merged_metadata) writable_metadata = self._normalize_created_dates(writable_metadata, metadata) self._logger.debug(f"Updating '{variable_name}' using template {self._metadata_postprocessing[variable_name]} and metadata {merged_metadata}\n: '{old_value}'->'{writable_metadata[variable_name]}'") except Exception as e: @@ -142,7 +220,7 @@ def get_new_metadata(self, metadata, content): class Postprocessor: - def __init__(self, api, rules_dir, postprocessing_tag = None, dry_run = False, logger = None): + def __init__(self, api, rules_dir, postprocessing_tag = None, invalid_tag = None, dry_run = False, skip_validation = False, logger = None): self._logger = logger if self._logger is None: logging.basicConfig(format="[%(asctime)s] [%(levelname)s] [%(module)s] %(message)s", level="CRITICAL") @@ -154,7 +232,15 @@ def __init__(self, api, rules_dir, postprocessing_tag = None, dry_run = False, l self._postprocessing_tag_id = self._api.get_item_id_by_name("tags", postprocessing_tag) else: self._postprocessing_tag_id = None + + if invalid_tag is not None: + self._invalid_tag_id = self._api.get_item_id_by_name("tags", invalid_tag) + else: + self._invalid_tag_id = None + + self._dry_run = dry_run + self._skip_validation = skip_validation self._processors = [] @@ -164,7 +250,7 @@ def __init__(self, api, rules_dir, postprocessing_tag = None, dry_run = False, l try: yaml_documents = yaml.safe_load_all(yaml_file) for yaml_document in yaml_documents: - self._processors.append(DocumentRuleProcessor(yaml_document, self._logger)) + self._processors.append(DocumentRuleProcessor(self._api, yaml_document, self._logger)) except Exception as e: self._logger.warning(f"Unable to parse yaml in {filename}: {e}") self._logger.debug(f"Loaded {len(self._processors)} rules") @@ 
-183,9 +269,16 @@ def _get_new_metadata_in_filename_format(self, metadata_in_filename_format, cont return new_metadata + def _validate(self, metadata_in_filename_format): + for processor in self._processors: + if processor.matches(metadata_in_filename_format): + if not processor.validate(metadata_in_filename_format): + return False + return True def postprocess(self, documents): backup_documents = [] + num_invalid = 0 for document in documents: metadata_in_filename_format = self._api.get_metadata_in_filename_format(document) self._logger.debug(f"metadata_in_filename_format={metadata_in_filename_format}") @@ -212,7 +305,29 @@ def postprocess(self, documents): self._logger.info(f"No changes for document_id={document['id']}") else: self._logger.info(f"No changes for document_id={document['id']}") - + + if (not self._skip_validation) and (self._invalid_tag_id is not None): + # Note that we have to refetch the document here to get the changes we just applied from postprocessing + metadata_in_filename_format = self._api.get_metadata_in_filename_format(self._api.get_document_by_id(document['id'])) + metadata = self._api.get_metadata_from_filename_format(metadata_in_filename_format) + valid = self._validate(metadata_in_filename_format) + if not valid: + num_invalid += 1 + metadata["tags"].append(self._invalid_tag_id) + self._logger.warning(f"document_id={document['id']} is invalid, adding tag {self._invalid_tag_id}") + if not self._dry_run: + self._api.patch_document(document["id"], {"tags": metadata["tags"]}) + backup_data = {"tags": metadata["tags"]} + backup_data["id"] = document["id"] + backup_documents.append(backup_data) + else: + self._logger.info(f"document_id={document['id']} is valid") + else: + self._logger.info(f"Validation was skipped since invalid_tag_id={self._invalid_tag_id} and skip_validation={self._skip_validation}") + + if num_invalid > 0: + self._logger.warning(f"Found {num_invalid}/{len(documents)} invalid documents") + return backup_documents # # if "created_year" in regex_data.keys(): diff --git a/post_consume_cid_fixer.py b/post_consume_cid_fixer.py index ec4cc8e..21dc221 100755 --- a/post_consume_cid_fixer.py +++ b/post_consume_cid_fixer.py @@ -12,7 +12,7 @@ if __name__ == "__main__": document_id = os.environ["DOCUMENT_ID"] - config = Config() + config = Config(Config.general_options()) logging.basicConfig(format="[%(asctime)s] [%(levelname)s] [%(module)s] %(message)s", level=config["verbose"]) api = PaperlessAPI(config["paperless_api_url"], diff --git a/post_consume_script.py b/post_consume_script.py index ba18028..2db4cec 100755 --- a/post_consume_script.py +++ b/post_consume_script.py @@ -15,14 +15,15 @@ if document_id is not None: subprocess.run((str(Path(directory)/"paperlessngx_postprocessor.py"), - "document_id", + "process", + "--document-id", document_id)) post_consume_script = os.environ.get("PNGX_POSTPROCESSOR_POST_CONSUME_SCRIPT") if post_consume_script is not None: logging.basicConfig(format="[%(asctime)s] [%(levelname)s] [%(module)s] %(message)s") - config = Config() + config = Config(Config.general_options()) logging.getLogger().setLevel(config["verbose"]) diff --git a/post_consume_title_change_detector.py b/post_consume_title_change_detector.py index 85d57dd..aba3f16 100755 --- a/post_consume_title_change_detector.py +++ b/post_consume_title_change_detector.py @@ -19,7 +19,7 @@ new_filename = Path(os.environ["DOCUMENT_SOURCE_PATH"]).name if old_filename != new_filename: - config = Config() + config = Config(Config.general_options()) api = 
PaperlessAPI(config["paperless_api_url"], auth_token = config["auth_token"], paperless_src_dir = config["paperless_src_dir"]) diff --git a/rulesets.d/example.yml b/rulesets.d/example.yml index ce4e3f8..22a251a 100644 --- a/rulesets.d/example.yml +++ b/rulesets.d/example.yml @@ -24,3 +24,4 @@ Parse creation date from filename: created_year: '{{ title_old | regex_sub("^(?P\d{4})-(?P\d{2})-(?P\d{2}) (?P.*)$", "\g<created_year>") }}' created_month: '{{ title_old | regex_sub("^(?P<created_year>\d{4})-(?P<created_month>\d{2})-(?P<created_day>\d{2}) (?P<title>.*)$", "\g<created_month>") }}' created_day: '{{ title_old | regex_sub("^(?P<created_year>\d{4})-(?P<created_month>\d{2})-(?P<created_day>\d{2}) (?P<title>.*)$", "\g<created_day>") }}' + validation_rule: '{{ num_documents(correspondent=correspondent, document_type=document_type, created_date_object=created_date_object) == 1 }}'
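+    # The rule above marks this document as invalid if it is not the only document with this correspondent, document type, and creation date (e.g. if the same file was consumed twice)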