4 changes: 2 additions & 2 deletions README.md
@@ -1,4 +1,4 @@
-# Augur NEW Release v0.86.0
+# Augur NEW Release v0.86.1

Augur is primarily a data engineering tool that makes it possible for data scientists to gather open source software community data - less data carpentry for everyone else!
The primary way of looking at Augur data is through [8Knot](https://github.com/oss-aspen/8knot), a public instance of 8Knot is available [here](https://metrix.chaoss.io) - this is tied to a public instance of [Augur](https://ai.chaoss.io).
@@ -11,7 +11,7 @@ We follow the [First Timers Only](https://www.firsttimersonly.com/) philosophy o
## NEW RELEASE ALERT!
**If you want to jump right in, the updated docker, docker-compose and bare metal installation instructions are available [here](docs/new-install.md)**.

-Augur is now releasing a dramatically improved new version to the ```main``` branch. It is also available [here](https://github.com/chaoss/augur/releases/tag/v0.86.0).
+Augur is now releasing a dramatically improved new version to the ```main``` branch. It is also available [here](https://github.com/chaoss/augur/releases/tag/v0.86.1).


- The `main` branch is a stable version of our new architecture, which features:
11 changes: 10 additions & 1 deletion augur/application/db/lib.py
@@ -1,3 +1,4 @@
+import re
import time
import random
import logging
@@ -17,7 +18,7 @@

logger = logging.getLogger("db_lib")

def convert_type_of_value(config_dict, logger=None):

Check warning on line 21 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)


data_type = config_dict["type"]
@@ -196,7 +197,7 @@

try:
working_commits = fetchall_data_from_sql_text(query)
except:

Check warning on line 200 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0702: No exception type(s) specified (bare-except)
working_commits = []

return working_commits
@@ -212,7 +213,7 @@

try:
missing_commit_hashes = fetchall_data_from_sql_text(fetch_missing_hashes_sql)
except:

Check warning on line 216 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0702: No exception type(s) specified (bare-except)
missing_commit_hashes = []

return missing_commit_hashes
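
The two `except:` clauses flagged as W0702 above catch every exception, including `KeyboardInterrupt` and `SystemExit`. A minimal sketch of one way such a fallback could name its exception type and log the failure, assuming a hypothetical `fetch_or_empty` helper and stand-in fetch functions (none of these are part of this PR):

```python
import logging

logger = logging.getLogger("db_lib_sketch")  # hypothetical logger name


def fetch_or_empty(fetch, query):
    """Return fetch(query), or an empty list if the fetch raises.

    `fetch` stands in for a function such as fetchall_data_from_sql_text;
    this helper is illustrative and not part of the PR.
    """
    try:
        return fetch(query)
    except Exception as e:
        # Unlike a bare except, this lets KeyboardInterrupt and SystemExit propagate.
        logger.error("Query failed, returning an empty list. Error: %s", e)
        return []


def failing_fetch(query):
    # Stand-in that always fails, to exercise the fallback path.
    raise RuntimeError("database unavailable")


print(fetch_or_empty(failing_fetch, "SELECT 1"))         # prints []
print(fetch_or_empty(lambda q: [{"n": 1}], "SELECT 1"))  # prints [{'n': 1}]
```

Catching the database driver's own exception class would be narrower still; `Exception` is used here only because the sketch does not assume a particular driver.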
@@ -232,7 +233,7 @@
return session.query(CollectionStatus).filter(getattr(CollectionStatus,f"{collection_type}_status" ) == CollectionState.COLLECTING.value).count()


def facade_bulk_insert_commits(logger, records):

Check warning on line 236 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)

with get_session() as session:

@@ -243,6 +244,7 @@
)
session.commit()
except Exception as e:
+session.rollback()

if len(records) > 1:
logger.error(f"Ran into issue when trying to insert commits \n Error: {e}")
@@ -257,7 +259,14 @@
commit_record = records[0]
#replace incomprehensible dates with epoch.
#2021-10-11 11:57:46 -0500
-placeholder_date = "1970-01-01 00:00:15 -0500"
+
+# placeholder_date = "1970-01-01 00:00:15 -0500"
+placeholder_date = commit_record['author_timestamp']
+
+# Reconstruct timezone portion of the date string to UTC
+placeholder_date = re.split("[-+]", placeholder_date)
+placeholder_date.pop()
+placeholder_date = "-".join(placeholder_date) + "+0000"

#Check for improper utc timezone offset
#UTC timezone offset should be between -14:00 and +14:00
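
The added lines in this hunk replace the trailing UTC offset of the commit's author timestamp with `+0000`. A minimal sketch of the same three steps, wrapped in a hypothetical `force_utc_offset` helper (the function name and the sample timestamps are assumptions, not part of the PR):

```python
import re


def force_utc_offset(timestamp: str) -> str:
    """Replace the trailing signed offset of a timestamp string with +0000.

    Hypothetical helper that mirrors the split / pop / join steps in the diff.
    """
    # Split on every '-' and '+', so the date separators are split as well.
    parts = re.split("[-+]", timestamp)
    # Drop the last field, which is the offset digits (for example "0500").
    parts.pop()
    # Re-join the date fields with '-' and append the UTC offset.
    return "-".join(parts) + "+0000"


print(force_utc_offset("2021-10-11 11:57:46 -0500"))  # prints 2021-10-11 11:57:46 +0000
print(force_utc_offset("2021-10-11 11:57:46 +1900"))  # prints 2021-10-11 11:57:46 +0000
```

This relabels the offset without shifting the clock time. It also assumes the string ends with a signed offset; for an input without one, the last `-`-separated field (day and time) would be dropped instead.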
@@ -274,7 +283,7 @@
raise e


def batch_insert_contributors(logger, data: Union[List[dict], dict]) -> Optional[List[dict]]:

Check warning on line 286 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)

batch_size = 1000

@@ -285,7 +294,7 @@



def bulk_insert_dicts(logger, data: Union[List[dict], dict], table, natural_keys: List[str], return_columns: Optional[List[str]] = None, string_fields: Optional[List[str]] = None, on_conflict_update:bool = True) -> Optional[List[dict]]:

Check warning on line 297 in augur/application/db/lib.py (GitHub Actions / runner / pylint): W0621: Redefining name 'logger' from outer scope (line 19) (redefined-outer-name)

if isinstance(data, list) is False:

21 changes: 16 additions & 5 deletions augur/tasks/github/util/github_data_access.py
@@ -10,11 +10,11 @@

class RatelimitException(Exception):

-def __init__(self, response, message="Github Rate limit exceeded") -> None:
+def __init__(self, response, keys_used, message="Github Rate limit exceeded") -> None:

self.response = response

-super().__init__(message)
+super().__init__(f"{message}. Keys used: {keys_used}")

class UrlNotFoundException(Exception):
pass
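
With this change, the exception message carries the list of keys passed by the caller. A small sketch of the resulting message, using the class as it appears in the diff; the `None` response and the placeholder key strings are assumptions for illustration:

```python
class RatelimitException(Exception):

    def __init__(self, response, keys_used, message="Github Rate limit exceeded") -> None:

        self.response = response

        # The keys are interpolated into the message, so they appear in logs and tracebacks.
        super().__init__(f"{message}. Keys used: {keys_used}")


try:
    # Placeholder values; a real caller passes an HTTP response and the recent keys.
    raise RatelimitException(response=None, keys_used=["key-a", "key-b"])
except RatelimitException as e:
    print(e)  # prints Github Rate limit exceeded. Keys used: ['key-a', 'key-b']
```

Because `keys_used` is a required positional parameter placed before `message`, any existing call of the form `RatelimitException(response)` has to be updated, as the later hunks in this file do.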
@@ -29,6 +29,7 @@ def __init__(self, key_manager, logger: logging.Logger):
self.logger = logger
self.key_client = KeyClient("github_rest", logger)
self.key = None
+self.expired_keys_for_request = []

def get_resource_count(self, url):

@@ -108,7 +109,8 @@ def make_request(self, url, method="GET", timeout=100):
response = client.request(method=method, url=url, headers=headers, timeout=timeout, follow_redirects=True)

if response.status_code in [403, 429]:
-raise RatelimitException(response)
+self.expired_keys_for_request.append(self.key)
+raise RatelimitException(response, self.expired_keys_for_request[-5:])

if response.status_code == 404:
raise UrlNotFoundException(f"Could not find {url}")
@@ -120,7 +122,8 @@ def make_request(self, url, method="GET", timeout=100):

try:
if "X-RateLimit-Remaining" in response.headers and int(response.headers["X-RateLimit-Remaining"]) < GITHUB_RATELIMIT_REMAINING_CAP:
-raise RatelimitException(response)
+self.expired_keys_for_request.append(self.key)
+raise RatelimitException(response, self.expired_keys_for_request[-5:])
except ValueError:
self.logger.warning(f"X-RateLimit-Remaining was not an integer. Value: {response.headers['X-RateLimit-Remaining']}")

@@ -147,12 +150,16 @@ def __make_request_with_retries(self, url, method="GET", timeout=100):
"""

try:
-return self.make_request(url, method, timeout)
+result = self.make_request(url, method, timeout)
+self.expired_keys_for_request = []
+return result
except RatelimitException as e:
self.__handle_github_ratelimit_response(e.response)
raise e
except NotAuthorizedException as e:
+self.expired_keys_for_request = []
self.__handle_github_not_authorized_response()
raise e

def __handle_github_not_authorized_response(self):

@@ -162,6 +169,7 @@ def __handle_github_not_authorized_response(self):
def __handle_github_ratelimit_response(self, response):

headers = response.headers
+previous_key = self.key

if "Retry-After" in headers:

@@ -184,6 +192,9 @@ def __handle_github_ratelimit_response(self, response):
else:
self.key = self.key_client.expire(self.key, time.time() + 60)

+if previous_key == self.key:
+self.logger.error(f"The same key was returned after a request to expire it was sent (key: {self.key[-5:]})")
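
The bookkeeping added across these hunks can be sketched in isolation: each rate-limited attempt appends the current key, the exception receives at most the five most recent entries, and a successful request clears the list. The `KeyTracker` class, its method names, and the placeholder keys below are hypothetical stand-ins, not part of the PR:

```python
class KeyTracker:
    """Hypothetical stand-in for the key bookkeeping added in this diff."""

    def __init__(self):
        self.key = None
        self.expired_keys_for_request = []

    def record_ratelimit(self):
        # Mirrors make_request: remember the key that hit the limit and
        # report at most the five most recent entries.
        self.expired_keys_for_request.append(self.key)
        return self.expired_keys_for_request[-5:]

    def record_success(self):
        # Mirrors the retry wrapper: a successful request clears the history.
        self.expired_keys_for_request = []


tracker = KeyTracker()
for i in range(7):
    tracker.key = f"key-{i}"  # placeholder key values
    reported = tracker.record_ratelimit()

print(reported)  # prints ['key-2', 'key-3', 'key-4', 'key-5', 'key-6']
tracker.record_success()
print(tracker.expired_keys_for_request)  # prints []
```

Note the difference between the two slices, assuming keys are strings: `self.expired_keys_for_request[-5:]` is taken on the list and yields up to five whole keys, whereas `self.key[-5:]` in the error log above yields the last five characters of a single key.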

def __add_query_params(self, url: str, additional_params: dict) -> str:
"""Add query params to a url.

2 changes: 1 addition & 1 deletion docker/backend/Dockerfile
@@ -17,7 +17,7 @@ RUN go install github.com/ossf/scorecard/v5@v5.1.1 \
FROM python:3.11-slim-bullseye

LABEL maintainer="outdoors@acm.org"
-LABEL version="0.86.0"
+LABEL version="0.86.1"

ENV DEBIAN_FRONTEND=noninteractive
ENV PATH="/usr/bin/:/usr/local/bin:/usr/lib:${PATH}"
2 changes: 1 addition & 1 deletion docker/database/Dockerfile
@@ -2,11 +2,11 @@
FROM postgres:16

LABEL maintainer="outdoors@acm.org"
-LABEL version="0.86.0"
+LABEL version="0.86.1"

ENV POSTGRES_DB "test"

Check warning on line 7 in docker/database/Dockerfile (GitHub Actions / Build image (database)): LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format. More info: https://docs.docker.com/go/dockerfile/rule/legacy-key-value-format/
ENV POSTGRES_USER "augur"

Check warning on line 8 in docker/database/Dockerfile (GitHub Actions / Build image (database)): LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format. More info: https://docs.docker.com/go/dockerfile/rule/legacy-key-value-format/
ENV POSTGRES_PASSWORD "augur"

Check warning on line 9 in docker/database/Dockerfile (GitHub Actions / Build image (database)): SecretsUsedInArgOrEnv: Do not use ARG or ENV instructions for sensitive data (ENV "POSTGRES_PASSWORD"). More info: https://docs.docker.com/go/dockerfile/rule/secrets-used-in-arg-or-env/
Check warning on line 9 in docker/database/Dockerfile (GitHub Actions / Build image (database)): LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format. More info: https://docs.docker.com/go/dockerfile/rule/legacy-key-value-format/

EXPOSE 5432

2 changes: 1 addition & 1 deletion docker/rabbitmq/Dockerfile
@@ -1,10 +1,10 @@
FROM rabbitmq:3.12-management-alpine

LABEL maintainer="574/augur@simplelogin.com"
-LABEL version="0.86.0"
+LABEL version="0.86.1"

ARG RABBIT_MQ_DEFAULT_USER=augur
ARG RABBIT_MQ_DEFAULT_PASSWORD=password123

Check warning on line 7 in docker/rabbitmq/Dockerfile (GitHub Actions / Build image (rabbitmq)): SecretsUsedInArgOrEnv: Do not use ARG or ENV instructions for sensitive data (ARG "RABBIT_MQ_DEFAULT_PASSWORD"). More info: https://docs.docker.com/go/dockerfile/rule/secrets-used-in-arg-or-env/
ARG RABBIT_MQ_DEFAULT_VHOST=augur_vhost

COPY --chown=rabbitmq:rabbitmq ./docker/rabbitmq/augur.conf /etc/rabbitmq/conf.d/
4 changes: 2 additions & 2 deletions metadata.py
@@ -5,8 +5,8 @@

__short_description__ = "Python 3 package for free/libre and open-source software community metrics, models & data collection"

-__version__ = "0.86.0"
-__release__ = "v0.86.0 (Pod People)"
+__version__ = "0.86.1"
+__release__ = "v0.86.1 (Pod People)"

__license__ = "MIT"
__copyright__ = "University of Missouri, University of Nebraska-Omaha, CHAOSS, Sean Goggins, Brian Warner & Augurlabs 2025"