emergingtrajectories.knowledge
==============================

.. py:module:: emergingtrajectories.knowledge

.. autoapi-nested-parse::

   Solutions for finding, extracting, storing, and revisiting knowledge.


Classes
-------

.. autoapisummary::

   emergingtrajectories.knowledge.KnowledgeBaseFileCache


Functions
---------

.. autoapisummary::

   emergingtrajectories.knowledge.statement_to_search_queries
   emergingtrajectories.knowledge.uri_to_local


Module Contents
---------------

.. py:function:: statement_to_search_queries(statement_id: int, client: emergingtrajectories.Client, openai_api_key: str, num_queries: int = 3) -> list[str]

   Given a specific statement ID, this will return a list of queries you can put into a search engine to get useful information.

   :param statement_id: The ID of the statement to get search queries for.
   :type statement_id: int
   :param client: The Emerging Trajectories API client.
   :type client: Client
   :param openai_api_key: The OpenAI API key.
   :type openai_api_key: str
   :param num_queries: The number of queries to return. Defaults to 3.
   :type num_queries: int, optional

   :returns: A list of search queries.
   :rtype: list[str]


.. py:function:: uri_to_local(uri: str) -> str

   Convert a URI to a local file name. In this case, we typically use an MD5 sum.

   :param uri: The URI to convert.
   :type uri: str

   :returns: The MD5 sum of the URI.
   :rtype: str


.. py:class:: KnowledgeBaseFileCache(folder_path: str, cache_file: str = 'cache.json')

   The KnowledgeBaseFileCache is a simple file-based cache for web content and local files. The cache stores the original HTML, PDF, or TXT content and tracks when (if ever) an agent actually accessed the content.

   :param folder_path: The folder where the cache will be stored.
   :type folder_path: str
   :param cache_file: The name of the cache file. Defaults to "cache.json".
   :type cache_file: str, optional

   .. py:attribute:: root_path

   .. py:attribute:: root_parsed

   .. py:attribute:: root_original

   .. py:attribute:: cache_file

   .. py:attribute:: cache

   .. py:method:: save_state() -> None

      Saves the in-memory changes to the knowledge base to the JSON cache file.


   .. py:method:: load_cache() -> None

      Loads the cache from the cache file, or creates the relevant files and folders if one does not exist.


   .. py:method:: in_cache(uri: str) -> bool

      Checks if a URI is in the cache already.

      :param uri: The URI to check.
      :type uri: str

      :returns: True if the URI is in the cache, False otherwise.
      :rtype: bool


   .. py:method:: update_cache(uri: str, obtained_on: datetime.datetime, last_accessed: datetime.datetime) -> None

      Updates the cache file for a given URI, specifically when it was obtained and last accessed.

      :param uri: The URI to update.
      :type uri: str
      :param obtained_on: The date and time when the content was obtained.
      :type obtained_on: datetime
      :param last_accessed: The date and time when the content was last accessed.
      :type last_accessed: datetime


   .. py:method:: log_access(uri: str) -> None

      Saves the last accessed time and updates the accessed tracker for a given URI.

      :param uri: The URI to update.
      :type uri: str


   .. py:method:: get_unaccessed_content() -> list[str]

      Returns a list of URIs that have not been accessed by the agent.

      :returns: A list of URIs that have not been accessed by the agent.
      :rtype: list[str]


   .. py:method:: get(uri: str) -> str

      Returns the content for a given URI. If the content is not in the cache, it will be scraped and added to the cache.

      :param uri: The URI to get the content for.
      :type uri: str

      :returns: The content for the given URI.
      :rtype: str


   .. py:method:: add_content(content: str, uri: str = None) -> None

      Adds content to the cache.

      :param content: The content to add to the cache.
      :type content: str
      :param uri: The URI to use for the content. Defaults to None, in which case an MD5 sum of the content will be used.
      :type uri: str, optional


   .. py:method:: add_content_from_file(filepath: str, uri: str = None) -> None

      Adds content from a text file to the cache.

      :param filepath: The path to the file to add to the cache.
      :type filepath: str
      :param uri: The URI to use for the content. Defaults to None, in which case an MD5 sum of the content will be used.
      :type uri: str, optional
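
The behavior described above (MD5-based file names, a JSON index, and access tracking) can be illustrated with a minimal, standalone sketch. This is not the library's implementation. The names ``uri_to_local_sketch`` and ``FileCacheSketch``, the ``accessed`` field, and the layout of the JSON index are assumptions made for illustration only. Scraping of uncached URIs, which ``get`` is documented to perform, is left out, so the sketch raises ``KeyError`` instead.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Optional


def uri_to_local_sketch(uri: str) -> str:
    """Hypothetical stand-in: map a URI to a file name via its MD5 hex digest."""
    return hashlib.md5(uri.encode("utf-8")).hexdigest()


class FileCacheSketch:
    """Hypothetical, simplified file cache; the index layout is an assumption."""

    def __init__(self, folder_path: str, cache_file: str = "cache.json") -> None:
        self.root = Path(folder_path)
        self.root.mkdir(parents=True, exist_ok=True)
        self.index_path = self.root / cache_file
        # Load the JSON index if it exists, otherwise start with an empty one.
        if self.index_path.exists():
            self.cache = json.loads(self.index_path.read_text(encoding="utf-8"))
        else:
            self.cache = {}

    def save_state(self) -> None:
        # Persist the in-memory index to the JSON cache file.
        self.index_path.write_text(json.dumps(self.cache, indent=2), encoding="utf-8")

    def in_cache(self, uri: str) -> bool:
        return uri in self.cache

    def add_content(self, content: str, uri: Optional[str] = None) -> None:
        # With no URI given, fall back to the MD5 sum of the content itself.
        if uri is None:
            uri = hashlib.md5(content.encode("utf-8")).hexdigest()
        (self.root / uri_to_local_sketch(uri)).write_text(content, encoding="utf-8")
        self.cache[uri] = {
            "obtained_on": datetime.now(timezone.utc).isoformat(),
            "last_accessed": None,
            "accessed": False,
        }
        self.save_state()

    def log_access(self, uri: str) -> None:
        # Record the access time and mark the entry as accessed.
        self.cache[uri]["last_accessed"] = datetime.now(timezone.utc).isoformat()
        self.cache[uri]["accessed"] = True
        self.save_state()

    def get_unaccessed_content(self) -> List[str]:
        return [uri for uri, meta in self.cache.items() if not meta["accessed"]]

    def get(self, uri: str) -> str:
        # The documented method scrapes missing URIs; this sketch does not.
        if uri not in self.cache:
            raise KeyError(uri)
        self.log_access(uri)
        return (self.root / uri_to_local_sketch(uri)).read_text(encoding="utf-8")


# Usage: add two items, read one, then list what has not been read yet.
with tempfile.TemporaryDirectory() as folder:
    kb = FileCacheSketch(folder)
    kb.add_content("first document", uri="https://example.com/a")
    kb.add_content("second document", uri="https://example.com/b")
    kb.get("https://example.com/a")
    print(kb.get_unaccessed_content())
```

Keeping the index in a single JSON file, separate from the stored content, means the access history can be inspected or reset without touching the cached documents.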