I don’t understand what venv is or why this would work better. Will this make the compatibility issues go away? I could also just create a virtual Ubuntu environment that’s fresh if that would be easier and try to give that environment access to my GPU but I don’t know if that would work.
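For what it’s worth, a venv is just a directory with its own Python and its own site-packages, so packages installed into it cannot clash with whatever is already installed system-wide or under ~/.local. It only addresses Python package conflicts, not GPU driver problems. A minimal sketch using the standard-library `venv` module follows; the directory name `~/trocr-env` and the helper names `create_env` and `install_into` are placeholders of mine, not something from the thread.

```python
import subprocess
import sys
import venv
from pathlib import Path
from typing import Sequence


def create_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its Python."""
    # with_pip=True needs the ensurepip module (package python3-venv on Ubuntu).
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def install_into(python: Path, packages: Sequence[str]) -> None:
    """Install packages with the environment's own pip, not the system one."""
    subprocess.run([str(python), "-m", "pip", "install", *packages], check=True)


# Hypothetical usage (downloads packages, so it is left commented out):
# py = create_env(Path.home() / "trocr-env")
# install_into(py, ["transformers", "pillow", "pdf2image"])
```

From a shell the same thing is `python3 -m venv ~/trocr-env` followed by `source ~/trocr-env/bin/activate`; after that, `pip install` only touches the new environment.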
because it’s encrypted with HTTPS and my VPN has a decent reputation. Yes, it’s possible that the VPN is secretly selling everything they can to big data, but if that were exposed it would ruin their entire business.
That’s exactly what I am trying to do; I’m just not sure how to do it. I have the hardware I need. I just need to set up a Docker container with PyTorch, find a way to set up Gradio inside it, and then add TrOCR from Hugging Face. I’m not sure how to do that and it seems hard. When I ask an AI for advice, it often says “just run the following” and it’s wrong, and I’m not skilled enough to know why.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Should I be trying to leave the USA before it's too late?
6·5 days ago
I previously tried to commit suicide when I was younger and was involuntarily hospitalized. As a result I’m a prohibited person. I can’t legally buy a firearm. This makes it harder for me to protect myself if things get worse. I don’t know if voting in my liberal community is going to change anything.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·5 days ago
I don’t trust big tech to not extract data and metadata and save it. Many companies get served with government requests to save data and keep it secret. Even if handwritingocr.com doesn’t have such an agreement, it could run on AWS, and AWS could have one. I would much rather do this locally. Some of the writings are confidential. Handwritingocr.com says data is encrypted in transit and at rest, but it’s not open source, and even if it were I couldn’t verify the server code.
Also, Tesseract is CPU-only, right? It will be so slow.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·5 days ago
So try again… in a couple of hours…
Why would that make a difference? It’s a local model right?
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Should I be trying to leave the USA before it's too late?
71·5 days ago
I’m pragmatic. If a tornado is coming towards me, I don’t try to rush into the eye to fight it. I change the things I can and flee the things I can’t.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
Terminal error after running GPT code:

    python3 trocr_pdf.py small.pdf output.txt
    Traceback (most recent call last):
      File "/home/user/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 479, in cached_files
        hf_hub_download(
      File "/home/user/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
        return fn(*args, **kwargs)
      File "/home/user/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1007, in hf_hub_download
        return _hf_hub_download_to_cache_dir(
      File "/home/user/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1124, in _hf_hub_download_to_cache_dir
        os.makedirs(os.path.dirname(blob_path), exist_ok=True)
      File "/usr/lib/python3.10/os.py", line 215, in makedirs
        makedirs(head, exist_ok=exist_ok)
      File "/usr/lib/python3.10/os.py", line 225, in makedirs
        mkdir(name, mode)
    PermissionError: [Errno 13] Permission denied: '/home/user/.cache/huggingface/hub/models--microsoft--trocr-base-handwritten'

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/home/user/Documents/trocr_pdf.py", line 39, in <module>
        main(pdf_path, out_path)
      File "/home/user/Documents/trocr_pdf.py", line 11, in main
        processor = TrOCRProcessor.from_pretrained(model_name)
      File "/home/user/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1394, in from_pretrained
        args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
      File "/home/user/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1453, in _get_arguments_from_pretrained
        args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
      File "/home/user/.local/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py", line 489, in from_pretrained
        raise initial_exception
      File "/home/user/.local/lib/python3.10/site-packages/transformers/models/auto/image_processing_auto.py", line 476, in from_pretrained
        config_dict, _ = ImageProcessingMixin.get_image_processor_dict(
      File "/home/user/.local/lib/python3.10/site-packages/transformers/image_processing_base.py", line 333, in get_image_processor_dict
        resolved_image_processor_files = [
      File "/home/user/.local/lib/python3.10/site-packages/transformers/image_processing_base.py", line 337, in <listcomp>
        resolved_file := cached_file(
      File "/home/user/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 322, in cached_file
        file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
      File "/home/user/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 524, in cached_files
        raise OSError(
    OSError: PermissionError at /home/user/.cache/huggingface/hub/models--microsoft--trocr-base-handwritten when downloading microsoft/trocr-base-handwritten. Check cache directory permissions. Common causes: 1) another user is downloading the same model (please wait); 2) a previous download was canceled and the lock file needs manual removal.

LLMs are so bad at code sometimes. This happens all the time with LLMs and code for me: the code is unusable and it saves no time because it’s a rabbit hole leading to nowhere.
I also don’t know if this is the right approach to the problem. Any sort of GUI interface would be easier. This is also hundreds of pages of handwritten stuff I want to change to text.
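The last line of that traceback is the actual problem: Python was not allowed to create a folder under ~/.cache/huggingface/hub. One common cause, and this is only a guess for this machine, is that part of the cache was created by root during an earlier `sudo` run. The sketch below checks who owns the nearest existing cache folder and whether the current user can write to it. `HF_HOME` is the environment variable huggingface_hub reads for its cache location; the function names and the fallback directory are hypothetical.

```python
import os
from pathlib import Path

DEFAULT_CACHE = Path("~/.cache/huggingface/hub")


def first_existing_ancestor(path: Path) -> Path:
    """Return path itself if it exists, else the nearest ancestor that does."""
    path = path.expanduser().absolute()
    while not path.exists():
        path = path.parent
    return path


def diagnose(cache_dir: Path = DEFAULT_CACHE) -> dict:
    """Report ownership and writability of the folder that blocks the download."""
    anchor = first_existing_ancestor(cache_dir)
    return {
        "checked": str(anchor),
        "writable": os.access(anchor, os.W_OK | os.X_OK),
        "owner_uid": anchor.stat().st_uid,
        "my_uid": os.getuid(),  # Linux/macOS only
    }


def use_private_cache(directory: Path) -> None:
    """Point Hugging Face libraries at a writable directory.

    This has to run before `import transformers` to take effect.
    """
    directory = directory.expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(directory)
```

If `owner_uid` and `my_uid` differ, taking ownership back with `sudo chown -R "$USER" ~/.cache/huggingface` is the usual fix. Calling `use_private_cache(Path("~/hf-cache"))` at the top of the script is a workaround that leaves the old folder alone.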
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
that’s not for TrOCR, it’s just for OCR, which may not work for handwriting
I did try some of the GPT steps:

    pip install --upgrade transformers pillow pdf2image

getting some errors:

    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━ 3/4 [transformers]
    WARNING: The scripts transformers and transformers-cli are installed in '/home/user/.local/bin' which is not on PATH.
    Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
    ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    mistral-common 1.5.2 requires pillow<11.0.0,>=10.3.0, but you have pillow 12.1.0 which is incompatible.
    moviepy 2.1.2 requires pillow<11.0,>=9.2.0, but you have pillow 12.1.0 which is incompatible.

this is what GPT said to run, but it makes no sense because I don’t have TrOCR even downloaded or running at all.

    Install packages: pip install --upgrade transformers pillow pdf2image
    Ensure poppler is installed:
    Ubuntu/Debian: sudo apt install -y poppler-utils
    macOS: brew install poppler
    Execute: python3 trocr_pdf.py input.pdf output.txt

That’s the script to save and run.
    #!/usr/bin/env python3
    import sys

    from pdf2image import convert_from_path
    from PIL import Image
    import torch
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel


    def main(pdf_path, out_path="output.txt", dpi=300):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        model_name = "microsoft/trocr-base-handwritten"
        processor = TrOCRProcessor.from_pretrained(model_name)
        model = VisionEncoderDecoderModel.from_pretrained(model_name).to(device)

        pages = convert_from_path(pdf_path, dpi=dpi)
        results = []
        for i, page in enumerate(pages, 1):
            page = page.convert("RGB")
            # downscale if very large to avoid OOM
            max_dim = 1600
            if max(page.width, page.height) > max_dim:
                scale = max_dim / max(page.width, page.height)
                page = page.resize(
                    (int(page.width * scale), int(page.height * scale)),
                    Image.Resampling.LANCZOS,
                )
            pixel_values = processor(images=page, return_tensors="pt").pixel_values.to(device)
            generated_ids = model.generate(pixel_values, max_length=512)
            text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
            results.append(f"--- Page {i} ---\n{text.strip()}\n")

        with open(out_path, "w", encoding="utf-8") as f:
            f.write("\n".join(results))
        print(f"Saved OCR text to {out_path}")


    if __name__ == "__main__":
        if len(sys.argv) < 2:
            print("Usage: python3 trocr_pdf.py input.pdf [output.txt]")
            sys.exit(1)
        pdf_path = sys.argv[1]
        out_path = sys.argv[2] if len(sys.argv) > 2 else "output.txt"
        main(pdf_path, out_path)
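One caveat with that script, separate from the permission error: as far as I know, the TrOCR checkpoints work on images of a single line of text, so passing a whole page to `model.generate` will likely return only a few words. The page has to be cut into line strips first. Below is a rough sketch of one simple way to do that, a horizontal projection profile, written in plain Python on a 0/1 "ink" grid. The function name `find_text_rows` and its thresholds are mine, and this approach assumes fairly straight, well-separated lines; slanted or overlapping handwriting needs a proper layout-detection step.

```python
from typing import List, Sequence, Tuple


def find_text_rows(
    ink: Sequence[Sequence[int]],
    min_ink: int = 1,
    min_height: int = 2,
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges that contain ink; bottom is exclusive.

    `ink` is a grid where 1 marks a dark pixel and 0 marks background.
    Bands shorter than `min_height` rows are treated as noise and dropped.
    """
    bands: List[Tuple[int, int]] = []
    start = None
    for y, row in enumerate(ink):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a text band begins
        elif not has_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None  # back in the gap between lines
    if start is not None and len(ink) - start >= min_height:
        bands.append((start, len(ink)))  # band runs to the bottom edge
    return bands


page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
]
print(find_text_rows(page))  # [(1, 3), (5, 8)]
```

With Pillow, each band could then be cut out with `page.crop((0, top, page.width, bottom))` and sent through the processor and model one strip at a time.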
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
I don’t remember exactly, but I have ROCm 7.2 installed, and there was something I was trying to install inside pip for ROCm that just wouldn’t work; it was as if the ROCm 7.2 build wasn’t out yet or the link didn’t work. The LLM tried multiple suggestions and they all failed, then I gave up. When I said “inside” pip, I don’t know if that’s accurate. I am very new to pip, decent at Linux, know only a small amount of coding, and lack Python familiarity.
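Before fighting pip any further, it can help to check which kind of PyTorch build is actually installed and whether it sees the GPU. ROCm builds of PyTorch expose the GPU through the same `torch.cuda` API as NVIDIA builds and set `torch.version.hip`. A small sketch, with a helper name (`describe_torch_backend`) that is mine; I don’t know whether a wheel matching ROCm 7.2 exists, so this only reports what is installed right now.

```python
def describe_torch_backend() -> str:
    """Summarise the installed PyTorch build and whether a GPU is visible."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    hip = getattr(torch.version, "hip", None)    # set on ROCm builds
    cuda = getattr(torch.version, "cuda", None)  # set on NVIDIA builds
    if hip:
        build = f"ROCm/HIP build ({hip})"
    elif cuda:
        build = f"CUDA build ({cuda})"
    else:
        build = "CPU-only build"

    if torch.cuda.is_available():
        return f"{build}; GPU visible: {torch.cuda.get_device_name(0)}"
    return f"{build}; no GPU visible to PyTorch"


print(describe_torch_backend())
```

If this reports a CPU-only or CUDA build on an AMD machine, the wrong wheel was installed, and the script would silently fall back to the CPU.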
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Should I be trying to leave the USA before it's too late?
253·6 days ago
Lots of people laughing in Germany in the 30s at people fleeing weren’t laughing in the 40s.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
Which community? I did look and didn’t see one.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
24·6 days ago
I am happy to participate. I do not want my IP and ping data sold to data brokers to serve targeted ads, track me, and go to the police surveillance state. I’m sure you are a good citizen who always keeps location on and feels like life would be easier if everyone just complied while you proudly put ring cameras on every door. Not everyone is a tech bro neo-feudalist.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
I got that a lot when I first started using Linux-based distros: “why aren’t you typing man?” or “just type sudo rmdir /”.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
11·6 days ago
GPT 5 gave me a lot of code that returned errors. I really need help with the specific terminal code or knowing if I am even approaching the problem right.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
21·6 days ago
I am trying to run an OCR program for handwriting to process some large PDFs of scanned old journals. Doing it by hand would take a very long time. I have an AMD GPU and have ROCm installed. I tried to configure pip with ROCm and failed. I was considering pulling a PyTorch Docker image, configuring Gradio in it, and then trying to get Gradio to run TrOCR. I have never run Gradio. I have “easier” LLM programs like LM Studio and Ollama, but I don’t know if they can run TrOCR. There is AMD documentation on running OCR (https://rocm.docs.amd.com/projects/ai-developer-hub/en/latest/notebooks/inference/ocr_vllm.html) but it’s not clear whether it works well with handwriting. TrOCR has a checkpoint trained specifically for handwriting. It’s also on Hugging Face, which I don’t know how to use that well.
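For the Gradio part of that plan, the wiring is small once PyTorch works. Here is a minimal sketch, assuming `gradio`, `torch`, and `transformers` are installed in the same environment; `build_app` and `recognize` are names I made up, and the model is the `microsoft/trocr-base-handwritten` checkpoint on Hugging Face. It expects an image of one line of handwriting, not a full page.

```python
def build_app(model_name: str = "microsoft/trocr-base-handwritten"):
    """Build a local web UI that runs TrOCR on an uploaded image."""
    # Imported here so this file can be loaded even before the packages exist.
    import gradio as gr
    import torch
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # ROCm builds of PyTorch also report their GPU through torch.cuda.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name).to(device)

    def recognize(image):
        if image is None:  # nothing uploaded yet
            return ""
        inputs = processor(images=image.convert("RGB"), return_tensors="pt")
        pixel_values = inputs.pixel_values.to(device)
        with torch.no_grad():
            generated_ids = model.generate(pixel_values, max_new_tokens=64)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    return gr.Interface(
        fn=recognize,
        inputs=gr.Image(type="pil"),
        outputs="text",
        title="TrOCR handwriting demo",
    )


# To start the UI (downloads the model on first run):
# build_app().launch()
```

By default `launch()` serves on http://127.0.0.1:7860, so everything stays on the local machine after the one-time model download.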
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Is there an alternative to Stack Overflow?
33·6 days ago
hell fucking no. I do not need some large corporate company taking a) my browser fingerprint b) my real IP address and ping time and c) selling them to every data broker that exists. Fuck no. Buying a residential IP costs less than a dollar. Big data does not need to associate my online activities with who I am IRL.
It’s also a terrible idea because a lot of my online protection comes from no one knowing my initial ping time, so I would be giving big data information like “here’s this user with browser hash ########### and they have a ping time of X ms and origination IP of X.X.X.X.” After that, even if I protected my origination IP later, they could guess who I am based on hops and ping time and browser fingerprint, because even privacy browsers have a browser fingerprint. Fuuuuuuuuuuuuuuck no.
someone@lemmy.todayOPto
Ask Lemmy@lemmy.world•Are protestors in Iran aware of Briar? Would using Briar to communicate via bluetooth be a viable option with the country's current jamming technology?
41·10 days ago
I added a warning to the original post based on what you said.
someone@lemmy.todayto
Privacy@lemmy.ml•Facebook is forcing new users to use facial recognition
31·10 days ago
Most of the public’s awareness of technology is so naive that they basically stick their heads in the sand, because they want the convenience and ease and are willing to overlook evil. So they don’t care about supply chain evils or corruption that is embedded into the system, as long as they can scroll TikTok and watch Netflix in their off time while using Klarna to pay for the groceries. Real organization against organizers in which people aren’t being data mined would require some technological awareness. The masses just don’t have it.
They join political groups on Facebook then get data mined and classified into oblivion until a computer can process their views and give them a discounted ad-supported Netflix so they are less politically upset.
This is our world.
They constantly measure DOMRect using JavaScript, which is a unique hardware-based metric that can be used to track individual users.
Imagine the cost of running duck.ai. What exactly is the revenue that it brings in?
Of course, if it were some honeypot using DOMRects to track users (and DOMRect is not protected by Tor Browser or Mullvad Browser etc.), well, then it doesn’t really matter if it’s not bringing in much revenue, since its value is in being a honeypot.
Yes, DOMRect can be used legitimately in coding without tracking users… but why does DDG need to use this when they know that it CAN be used to track users and users have no way to audit the servers?
It’s really interesting that they measure DOMRect and not Canvas, when privacy-aware users often block canvas fingerprinting but don’t block DOMRect.
It’s sus