offline mode bug fix
jstzwj committed Jul 18, 2024
1 parent 5c9e390 commit c382c33
Showing 9 changed files with 286 additions and 142 deletions.
8 changes: 6 additions & 2 deletions README.md
@@ -1,5 +1,9 @@
-# olah
-Olah is self-hosted lightweight huggingface mirror service. `Olah` means `hello` in Hilichurlian.
+<h1 align="center">Olah</h1>
+
+<p align="center">
+<b>Self-hosted Lightweight Huggingface Mirror Service</b>
+
+Olah is a self-hosted lightweight huggingface mirror service. `Olah` means `hello` in Hilichurlian.
 Olah implemented the `mirroring` feature for huggingface resources, rather than just a simple `reverse proxy`.
 Olah does not immediately mirror the entire huggingface website but mirrors the resources at the file block level when users download them (or we can say cache them).

16 changes: 15 additions & 1 deletion README_zh.md
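@@ -1,4 +1,9 @@
-# olah
+<h1 align="center">Olah</h1>
+
+
+<p align="center">
+<b>Self-hosted Lightweight HuggingFace Mirror Service</b>
+
 Olah is a self-hosted lightweight HuggingFace mirror service. `Olah` comes from Hilichurlian, where it means `hello`.
 Olah truly implements `mirroring` of huggingface resources, rather than being just a simple `reverse proxy`.
 Olah does not immediately mirror the entire huggingface site; instead, it mirrors (or, we could say, caches) resources at the file-block level as users download them.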
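@@ -106,6 +111,15 @@ python -m olah.server --host localhost --port 8090 --repos-path ./hf_mirrors

 **Note that cached data cannot be migrated between versions. Delete the cache folder before upgrading olah.**

+## More Configuration
+
+More options can be controlled through a configuration file. Pass `configs.toml` as a command-line argument to set the configuration file path:
+```bash
+python -m olah.server -c configs.toml
+```
+
+The full configuration file is available at [assets/full_configs.toml](https://github.com/vtuber-plan/olah/blob/main/assets/full_configs.toml)
+
 ## License

 olah is released under the MIT License.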
38 changes: 28 additions & 10 deletions olah/files.py
@@ -176,16 +176,19 @@ async def _file_chunk_head(
     allow_cache: bool,
     file_size: int,
 ):
-    async with client.stream(
-        method=method,
-        url=url,
-        headers=headers,
-        timeout=WORKER_API_TIMEOUT,
-    ) as response:
-        async for raw_chunk in response.aiter_raw():
-            if not raw_chunk:
-                continue
-            yield raw_chunk
+    if not app.app_settings.config.offline:
+        async with client.stream(
+            method=method,
+            url=url,
+            headers=headers,
+            timeout=WORKER_API_TIMEOUT,
+        ) as response:
+            async for raw_chunk in response.aiter_raw():
+                if not raw_chunk:
+                    continue
+                yield raw_chunk
+    else:
+        yield b""


 async def _file_realtime_stream(
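
The guard added to `_file_chunk_head` above can be tried in isolation. The following is a minimal sketch under stated assumptions, not olah's actual code: `fetch_chunks` and `fake_upstream` are hypothetical names, `fake_upstream` stands in for the httpx stream, and the `offline` argument stands in for `app.app_settings.config.offline`. It shows an async generator that skips the upstream entirely in offline mode and yields a single empty chunk instead.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_upstream() -> AsyncIterator[bytes]:
    """Hypothetical stand-in for `response.aiter_raw()`; emits one empty chunk."""
    for chunk in (b"abc", b"", b"def"):
        yield chunk


async def fetch_chunks(offline: bool) -> AsyncIterator[bytes]:
    """Yield upstream chunks when online, or one empty chunk when offline."""
    if not offline:
        async for raw_chunk in fake_upstream():
            if not raw_chunk:
                continue  # skip empty chunks, as the diff does
            yield raw_chunk
    else:
        # Offline: the upstream is never contacted; a single empty chunk
        # is yielded so the consumer still receives one item.
        yield b""


async def collect(offline: bool) -> List[bytes]:
    """Drain the generator into a list for inspection."""
    return [chunk async for chunk in fetch_chunks(offline)]


online_chunks = asyncio.run(collect(offline=False))
offline_chunks = asyncio.run(collect(offline=True))
print(online_chunks)   # [b'abc', b'def']
print(offline_chunks)  # [b'']
```

Note that a consumer has to tolerate the empty chunk in the offline case, since the online path filters empty chunks out but the offline path deliberately emits one.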
@@ -206,6 +209,7 @@ async def _file_realtime_stream(
    else:
        hf_url = url

    # Handle Redirection
    if not app.app_settings.config.offline:
        async with httpx.AsyncClient() as client:
            response = await client.request(
@@ -222,11 +226,25 @@ async def _file_realtime_stream(
            if len(parsed_url.netloc) != 0:
                new_loc = urljoin(app.app_settings.config.mirror_lfs_url_base(), get_url_tail(response.headers["location"]))
                new_headers["location"] = new_loc

            if allow_cache:
                with open(head_path, "w", encoding="utf-8") as f:
                    f.write(json.dumps(new_headers, ensure_ascii=False))

            yield response.status_code
            yield new_headers
            yield response.content
            return
    else:
        if os.path.exists(head_path):
            with open(head_path, "r", encoding="utf-8") as f:
                head_content = json.loads(f.read())

            if "location" in head_content:
                yield 302
                yield head_content
                yield b""
                return

    async with httpx.AsyncClient() as client:
        # redirect_loc = await _get_redirected_url(client, method, url, request_headers)
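
The redirect handling in this hunk follows a cache-then-replay pattern: when online, redirect headers are written to the head file; when offline, they are read back and served as a 302. The sketch below illustrates that pattern under stated assumptions. `resolve_redirect` and the injected `fetch_head` callable are hypothetical names, not part of olah, and the real code yields values from an async generator rather than returning a tuple.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Optional, Tuple

Headers = Dict[str, str]


def resolve_redirect(
    head_path: str,
    offline: bool,
    allow_cache: bool,
    fetch_head: Callable[[], Tuple[int, Headers]],
) -> Optional[Tuple[int, Headers]]:
    """Return (status, headers) for a redirect, or None if there is none."""
    if not offline:
        status, headers = fetch_head()  # hypothetical upstream HEAD request
        if 300 <= status <= 399:
            if allow_cache:
                # Persist the redirect headers so they can be replayed offline.
                with open(head_path, "w", encoding="utf-8") as f:
                    f.write(json.dumps(headers, ensure_ascii=False))
            return status, headers
    elif os.path.exists(head_path):
        with open(head_path, "r", encoding="utf-8") as f:
            cached = json.loads(f.read())
        if "location" in cached:
            # Only headers are cached, so the replayed status is always 302.
            return 302, cached
    return None


with tempfile.TemporaryDirectory() as tmp:
    head_path = os.path.join(tmp, "head.json")

    def upstream() -> Tuple[int, Headers]:
        return 307, {"location": "/lfs/blob"}

    miss = resolve_redirect(head_path, True, True, upstream)     # nothing cached yet
    online = resolve_redirect(head_path, False, True, upstream)  # fetch and cache
    replay = resolve_redirect(head_path, True, True, upstream)   # served from cache
```

One consequence visible in the diff as well as in this sketch: the original status code is not stored, so a redirect first seen as 307 is replayed offline as 302.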