Microsoft open-sourced a Python tool for converting files and office documents to Markdown
GitHub - microsoft/markitdown: Python tool for converting files and office documents to Markdown.
No, it's not. But there's a general fear that spreading manifestos of terrorists could cause people to believe them. They did the same thing with Bin Laden's manifesto.
For the record, I oppose this, it's just that I can understand why this sort of thing is done.
Teens abandon X and Facebook as TikTok and WhatsApp gain momentum, report
Almost 1,400 teens aged 13 to 17 took part in the survey between September and October. The major takeaway, if not a particularly surprising one, is that...
Rob Thubron (TechSpot)
‘Sesame Street’ Hits the Market: HBO and Max Opt Not to Renew Deal For New Episodes
HBO Ends Sesame Street Deal, Show Selling to Other Streaming Services
The Warner Bros. Discovery streaming service Max will continue to license library episodes of the long-running children's show, but is shifting its programming strategy to focus on adult and family fare.
Alex Weprin (The Hollywood Reporter)
Map of GitHub
This website shows a map of GitHub. Each dot is a project. Two dots within the same cluster are usually close to each other if multiple users frequently gave stars to both projects.
anvaka.github.io
France’s credit rating cut by Moody’s just hours after new prime minister appointed
“Political fragmentation” in Paris means there is a “very low probability” of reducing the ballooning fiscal deficit, ratings agency says.
Jones Hayden (POLITICO)
Quentin Tarantino Says ‘There’s Not a Payoff’ on TV Shows Like ‘Yellowstone’: When It’s Over ‘It’s Out of My Head. It’s Completely Gone’
Quentin Tarantino Watched 'Yellowstone' and Forgot Every Detail
Quentin Tarantino says the difference between TV and movies is television lacks a payoff. He watched "Yellowstone" and forgot about it all.
Zack Sharf (Variety)
Wait, so is he talking specifically about Yellowstone and other shows with the same flaws as it? Or does he believe this is necessarily true of all TV?
Because yeah, I can think of plenty of shows that I liked at the time but feel more or less the same about as what he’s saying. Pretty much the entire Arrowverse. The US House of Cards remake. A bunch of shows that I eventually stopped watching without ever consciously deciding I wanted to drop them can probably be put down to this. Vikings. Peaky Blinders.
But I can also think of plenty of shows that really strongly stick with me. The original House of Cards. Chuck. Avatar: The Last Airbender. And that’s just looking at strongly plot-driven dramas. For comedy, there’s just no film that does anything even remotely like the best sitcoms in terms of how it feels emotionally to watch. Things like Parks & Rec or Brooklyn Nine-Nine.
Box Office: 'Kraven the Hunter' Bombs With Worst Start for Sony's Marvel Films
Sony's "Kraven the Hunter," a superhero spinoff starring Aaron Taylor-Johnson as Spider-Man's notorious foe, launched behind already low expectations.
Rebecca Rubin (Variety)
That was the plan.
Sony suffered a data breach in 2014, and all their emails leaked. In those emails, they were discussing plans to make a Sinister Six movie with Spider-Man having a major role.
The Sámi have the right to hunt and fish within the areas of the Sámi villages. That is the view put forward in Reformisterna's rural program. The so-called Girjas ruling, decided by the Supreme Court in 2020, gives the Girjas Sámi village the right to all hunting and fishing on state land within the village's area.
Lol. The most famous straight male porn star to ever live. Ron Jeremy. His dong is legendary.
Fun fact: He was a background extra in the original Ghostbusters movie, but went unnoticed for a couple of decades until the widescreen release came out on DVD. He was cropped out on the VHS release and the square TVs of the '80s and '90s, so no one knew he was there.
Jolani: Syria too 'exhausted' to enter new conflicts
Syrian rebel leader Ahmed al-Sharaa, better known as Abu Mohammed Jolani, said on Saturday that he is not interested in engaging in new conflicts, despite Israel's seizure of a buffer zone in the occupied Golan Heights and its air force pummelling Syrian military positions with air strikes.
In a post on Telegram, Jolani said that Israel's actions had "clearly crossed the disengagement line in Syria, which threatens a new unjustified escalation in the region".
But, he added that “the general exhaustion in Syria after years of war and conflict does not allow us to enter new conflicts”.
You understand that they were at war for a long time before they managed to sweep Assad and his forces away, right?
The final sudden advance may have come virtually overnight, but they've been fighting since the Arab Spring in 2011.
Israel approves plans to expand settlements in occupied Golan Heights
Israel's government has approved a plan to expand settlements in the occupied Golan Heights, Prime Minister Benjamin Netanyahu's office said on Sunday.
The statement said that Netanyahu acted "in light of the war and the new front facing Syria", as well as a desire to double the population of the Golan Heights.
I was reading an article that was quoting the settler leaders. They want to go all the way to the Euphrates River. That's half of Iraq.
These people are not dealing with the world in good faith.
The police murder in Gothenburg, 1900. Police officer Johan Fredrik Hedén and his colleague, named Fredén, had evicted Victor Jansson, born 18 September 1877, and his brother Arvid Jansson, who was 17 years old, from cigar dealer Hindberg's shop on Ånäsvägen in Gamlestaden.
Disillusioned in the USA: A Young American's Escape to China on a Journey to Discover Perspective
loathsome dongeater
in reply to ☆ Yσɠƚԋσʂ ☆
This could be useful to me. A while ago I was trying to make something that takes all unread posts from my feed reader, makes an epub out of them, and then puts it behind an OPDS server.
I found that converting the HTML from RSS to Markdown first and then compiling that to an epub was the most reliable way to strip the unnecessary markup from the source HTML. I used pandoc for this.
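A minimal sketch of the two-step pipeline described above, assuming pandoc is installed and on PATH. The function names, file names, and the default title are hypothetical, not taken from the comment; the pandoc options used (`-f`, `-t`, `-o`, `--metadata`) are standard pandoc flags.

```python
import shutil
import subprocess
from pathlib import Path


def html_to_markdown_cmd(html_path, md_path):
    """Build the pandoc command that converts one HTML file to Markdown."""
    return ["pandoc", "-f", "html", "-t", "markdown",
            "-o", str(md_path), str(html_path)]


def markdown_to_epub_cmd(md_paths, epub_path, title):
    """Build the pandoc command that compiles Markdown files into one EPUB."""
    return ["pandoc", "-f", "markdown", "-o", str(epub_path),
            "--metadata", f"title={title}",
            *[str(p) for p in md_paths]]


def build_epub(html_paths, epub_path, title="Unread posts"):
    """Run both steps; raises if pandoc is missing or a conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    md_paths = []
    for html_path in html_paths:
        # Hypothetical naming scheme: post.html -> post.md next to the source.
        md_path = Path(html_path).with_suffix(".md")
        subprocess.run(html_to_markdown_cmd(html_path, md_path), check=True)
        md_paths.append(md_path)
    subprocess.run(markdown_to_epub_cmd(md_paths, epub_path, title), check=True)
```

Going through Markdown as an intermediate format is what drops most of the presentational markup; the resulting EPUB could then be served by whatever OPDS server is in use.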
☆ Yσɠƚԋσʂ ☆ and haverholm like this.
☆ Yσɠƚԋσʂ ☆
in reply to loathsome dongeater
utopiah
in reply to loathsome dongeater
Please come back and share whether it does better or worse, and if so along which dimensions. Quite curious to better understand the differences.
☆ Yσɠƚԋσʂ ☆ likes this.
hexaflexagonbear [he/him]
in reply to ☆ Yσɠƚԋσʂ ☆
Max-P
in reply to hexaflexagonbear [he/him]
~Not really. All the features of that tool are basic functions we've had since LibreOffice was still OpenOffice.~
~Since this converts to Markdown, it's inherently a very lossy conversion. What's hard to pull off is preserving the full formatting when converting to an odt or something.~
Someone pointed out it doesn't just convert word documents to Markdown, it can also transcribe and OCR, so I guess it does have some usefulness!
haverholm likes this.
☆ Yσɠƚԋσʂ ☆ doesn't like this.
davel
in reply to Max-P
In saying this isn’t useful, you’re making a lot of assumptions about how someone might want to use this.
utopiah
in reply to davel
soffice works as a CLI, can be called from Python, and has plenty of related tooling, e.g. pypi.org/project/unoserver/, so I agree. I'm confused about what's actually novel here and better than that, or even than dedicated long-lasting FLOSS projects like pandoc.
haverholm likes this.
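As a rough sketch of calling soffice from Python, assuming LibreOffice is installed and its `soffice` binary is on PATH. The function names and defaults are hypothetical; `--headless`, `--convert-to`, and `--outdir` are LibreOffice command-line options.

```python
import shutil
import subprocess


def soffice_convert_cmd(src, target_format="odt", outdir="."):
    """Build a headless LibreOffice command that converts src to target_format."""
    return ["soffice", "--headless",
            "--convert-to", target_format,
            "--outdir", str(outdir),
            str(src)]


def convert(src, target_format="odt", outdir="."):
    """Run the conversion; raises if soffice is missing or exits with an error."""
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice not found on PATH")
    subprocess.run(soffice_convert_cmd(src, target_format, outdir), check=True)
```

For example, `convert("report.docx", "pdf", "out")` would ask LibreOffice to write a PDF into `out/`. Each call starts a new LibreOffice process, which is presumably the overhead a persistent listener such as unoserver is meant to avoid.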
vort3
in reply to davel
django
in reply to Max-P
utopiah
in reply to django
Quite curious... does it actually do that, and if so how? Because STT to get a plaintext file or subtitles (so with timing) has been available via e.g. Whisper, quite efficiently, for a while now. If it does do more, though, e.g. structure (differentiating a title, a list, etc.), I'd like to learn how.
django
in reply to utopiah
There is nothing special going on. This whole project is just a bunch of Python libraries glued together into a CLI tool.
It uses the SpeechRecognition package to connect to the Google Speech Recognition API:
github.com/microsoft/markitdow…
Pretty uninteresting and a bit disappointing. Pandoc is a lot more interesting.
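For illustration only, a minimal sketch of what a call through the SpeechRecognition package looks like. This is not markitdown's actual code; the function name and the empty-string fallback are assumptions. It requires the third-party `SpeechRecognition` package, and `recognize_google` uploads the audio to Google's web API.

```python
def transcribe_via_google(audio_path, language="en-US"):
    """Transcribe a WAV/AIFF/FLAC file; the audio is sent to Google's web API."""
    # Imported lazily so this module still loads without the third-party package.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(audio_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # assumption: treat unintelligible speech as an empty transcript
    except sr.RequestError as exc:
        raise RuntimeError(f"speech API request failed: {exc}") from exc
```

Because the recognition happens remotely, the result is plain text with no timing or document structure, and nothing is transcribed without network access.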
utopiah
in reply to django
recognize_google, and it seems to be relying on github.com/Uberi/speech_recogn… which in turn seems to rely on github.com/Uberi/speech_recogn… So basically, are they using an API and sending all the audio data to Google servers?
django
in reply to utopiah
utopiah
in reply to django
in reply to ☆ Yσɠƚԋσʂ ☆ • • •charles
in reply to davel • • •davel
in reply to charles • • •utopiah
in reply to ☆ Yσɠƚԋσʂ ☆ • • •FWIW if you are interested in such tooling consider also
sofficeandpandocwhich have (as far as I can tell) similar features but have been existing for years now and are not related to Microsoft.Edit: not related to Microsoft AND Google, seems the transcription aspect (which IMHO is still weird in that context but OK) is done via Google servers, cf lemmy.ml/post/23629310/1558686…
haverholm
in reply to utopiah
utopiah
in reply to haverholm
Thanks for the clarification, but I'm a bit confused here. You mean audio transcription, STT, as done by e.g. Whisper? If so, what's the use case? When I think of Office documents, audio transcription is not something I have in mind.
utopiah
in reply to utopiah
JackbyDev
in reply to utopiah
haverholm
in reply to utopiah
I'm not completely clear either on how Microsoft has implemented this previously. As I said, I didn't look very deep into the repository.
If these are indeed other Python projects they piled together, as others suggest, I'd be happy to hear what speech recognition library this might've been built on.
☆ Yσɠƚԋσʂ ☆ likes this.