Branching #95

Merged: 16 commits, Jul 27, 2024
25 changes: 23 additions & 2 deletions README.md
@@ -1,5 +1,25 @@
 # Tarchia

+Data As Code.
+
+- Data Changes as Commits
+- Branching for Development
+- Automated Testing
+- Merging and Deployment
+
+
+
+<!---
+- Schema changes in transaction commits
+- Multi-branch
+- Data should have a stale timeframe and a purge timeframe
+- Native Masking capability
+- Native Sampling capability
+- Create/Maintain Triggers
+- Expectations Checks on Commit
+- Secrets/Protected Data Checks on Commit
+--->
+
 Tarchia is an Active Data Catalog.

 Tarchia actively manages and catalogs data in real-time. Unlike traditional catalogs that serve merely as passive records, our Active Data Catalog is essential to the operational workflow, ensuring meta data is always up-to-date and readily accessible for system processes.
@@ -47,11 +67,12 @@ table/
~~~mermaid
flowchart TD
CATALOG --> COMMITS(Commit History)
CATALOG --> PERMS(Permissions)
CATALOG[(Catalog)] --> |Current| COMMIT(Commit)
CATALOG --> |Current| SCHEMA(Schema)
subgraph
COMMITS -..-> |Historical| COMMIT
COMMIT --> SCHEMA
COMMIT --> SCHEMA(Schema)
COMMIT --> ENCRYPTION(Encryption)
COMMIT --> MAN_LIST(Manifest/List)
end
MAN_LIST --> DATA(Data Files)
4 changes: 2 additions & 2 deletions cloudbuild.yaml
@@ -33,9 +33,9 @@ steps:
 "managed",
 "--allow-unauthenticated",
 "--timeout",
-"300",
+"60",
 "--cpu",
-"1",
+"2",
 "--memory",
 "1Gi",
 "--update-env-vars",
8 changes: 4 additions & 4 deletions main.py
@@ -37,13 +37,13 @@
 from uvicorn import run

 from tarchia import __version__
-from tarchia.middlewares import AuditMiddleware
-from tarchia.middlewares import AuthorizationMiddleware
-from tarchia.v1 import routes as v1_routes
+from tarchia.api.middlewares import AuditMiddleware
+from tarchia.api.middlewares import AuthorizationMiddleware
+from tarchia.api.v1 import v1_router

 application = FastAPI(title="Tarchia Metastore", version=__version__)

-application.include_router(v1_routes.v1_router)
+application.include_router(v1_router)
 application.add_middleware(AuthorizationMiddleware)
 application.add_middleware(AuditMiddleware)
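
A note for reviewers checking the middleware order in `main.py`: Starlette, which FastAPI builds on, places the most recently added middleware outermost, so `AuditMiddleware` (added last) would see each request before `AuthorizationMiddleware`. The stdlib-only sketch below illustrates that ordering. `TinyApp`, `make_middleware`, and the `"ping"` request are hypothetical stand-ins, not Tarchia or FastAPI code.

```python
from typing import Callable, List

Handler = Callable[[str], str]


def make_middleware(name: str, log: List[str]) -> Callable[[Handler], Handler]:
    """Build a hypothetical middleware factory that records when it runs."""

    def wrap(inner: Handler) -> Handler:
        def handler(request: str) -> str:
            log.append(f"{name}: before")
            response = inner(request)
            log.append(f"{name}: after")
            return response

        return handler

    return wrap


class TinyApp:
    """Stand-in for the application object (assumption: not FastAPI itself)."""

    def __init__(self, endpoint: Handler) -> None:
        self.endpoint = endpoint
        self.middleware: List[Callable[[Handler], Handler]] = []

    def add_middleware(self, factory: Callable[[Handler], Handler]) -> None:
        # The newest registration goes to the front, making it the outermost layer.
        self.middleware.insert(0, factory)

    def handle(self, request: str) -> str:
        handler = self.endpoint
        # Wrap from the innermost layer (registered first) outwards.
        for factory in reversed(self.middleware):
            handler = factory(handler)
        return handler(request)


calls: List[str] = []
app = TinyApp(lambda request: f"handled {request}")
app.add_middleware(make_middleware("authorization", calls))
app.add_middleware(make_middleware("audit", calls))
result = app.handle("ping")
```

In this sketch the audit layer wraps the authorization layer, so it records the request first and the response last, and can observe requests that the authorization layer goes on to reject.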

2 changes: 1 addition & 1 deletion tarchia/__version__.py
@@ -1,4 +1,4 @@
-__build__ = 122
+__build__ = 130

 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
File renamed without changes.
72 changes: 72 additions & 0 deletions tarchia/actions/scanners/expectations/evaluate.py
@@ -0,0 +1,72 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import typing

from data_expectations import Expectations
from data_expectations.errors import ExpectationNotMetError
from data_expectations.errors import ExpectationNotUnderstoodError

ALL_EXPECTATIONS = Expectations.all_expectations()


def evaluate_record(
    expectations: Expectations, record: dict, suppress_errors: bool = False
) -> bool:
    """
    Test a single record against a defined set of expectations.

    Args:
        expectations: The Expectations instance.
        record: The dictionary record to be tested.
        suppress_errors: Whether to suppress expectation errors and return False instead.

    Returns:
        True if all expectations are met, False otherwise.
    """
    for expectation_definition in expectations.set_of_expectations:
        # get the name of the expectation
        expectation = expectation_definition.expectation

        if expectation not in ALL_EXPECTATIONS:
            raise ExpectationNotUnderstoodError(expectation=expectation)

        base_config = {
            "row": record,
            "column": expectation_definition.column,
            **expectation_definition.config,
        }

        if not ALL_EXPECTATIONS[expectation](**base_config):
            if not suppress_errors:
                raise ExpectationNotMetError(expectation, record)
            return False  # data failed to meet expectation

    return True


def evaluate_list(
    expectations: Expectations, dictset: typing.Iterable[dict], suppress_errors: bool = False
) -> bool:
    """
    Evaluate a set of records against a defined set of Expectations.

    Args:
        expectations: The Expectations instance.
        dictset: The iterable set of dictionary records to be tested.
        suppress_errors: Whether to suppress expectation errors and return False for the entire set.

    Returns:
        True if all records meet all Expectations, False otherwise.
    """
    return all(evaluate_record(expectations, record, suppress_errors) for record in dictset)
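
To make the control flow of `evaluate_record` and `evaluate_list` easier to review, here is a minimal self-contained sketch of the same pattern: look up each check by name in a registry, call it with the row, column, and config, and stop at the first failure. It deliberately does not import `data_expectations`. The `Rule` class, the `REGISTRY` mapping, and the two check functions are hypothetical names invented for illustration, not that library's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List


def column_exists(*, row: dict, column: str, **_: Any) -> bool:
    """Hypothetical check: the column is present in the row."""
    return column in row


def value_between(*, row: dict, column: str, minimum: float, maximum: float, **_: Any) -> bool:
    """Hypothetical check: the column's value lies within [minimum, maximum]."""
    value = row.get(column)
    return value is not None and minimum <= value <= maximum


# Assumed registry mapping a check's name to its callable.
REGISTRY: Dict[str, Callable[..., bool]] = {
    "column_exists": column_exists,
    "value_between": value_between,
}


@dataclass
class Rule:
    """Hypothetical rule definition: which check to run, on which column."""

    expectation: str
    column: str
    config: Dict[str, Any] = field(default_factory=dict)


def record_passes(rules: List[Rule], record: dict) -> bool:
    """Return False at the first rule the record fails."""
    for rule in rules:
        if rule.expectation not in REGISTRY:
            raise KeyError(f"unknown expectation: {rule.expectation}")
        check = REGISTRY[rule.expectation]
        if not check(row=record, column=rule.column, **rule.config):
            return False
    return True


def all_pass(rules: List[Rule], records: Iterable[dict]) -> bool:
    """all() over a generator stops consuming records at the first failure."""
    return all(record_passes(rules, record) for record in records)


rules = [
    Rule("column_exists", "name"),
    Rule("value_between", "age", {"minimum": 0, "maximum": 120}),
]
good = [{"name": "a", "age": 30}]
bad = [{"name": "b", "age": 150}]
```

Because `all()` short-circuits, a large or streaming record set is only read up to the first failing record, which appears to be the same property `evaluate_list` relies on.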