

I'm not the smartest out there to explain it, but it's like… instead of floating-point numbers as the weights, it's just -1, 0, or 1.
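To make that concrete, here's a tiny NumPy sketch of the "absmean" scheme from the BitNet b1.58 paper: scale each weight tensor by the mean of its absolute values, round, and clamp to {-1, 0, 1}. The helper name is my own, and real implementations do this on the fly during training rather than as a one-off conversion:

```python
import numpy as np

def absmean_ternary(w: np.ndarray, eps: float = 1e-6):
    """Quantize a weight tensor to {-1, 0, 1} using absmean scaling."""
    scale = np.abs(w).mean() + eps             # per-tensor scaling factor
    w_q = np.clip(np.round(w / scale), -1, 1)  # round, then clamp to ternary
    return w_q.astype(np.int8), scale          # keep scale to dequantize later

w = np.random.randn(4, 4).astype(np.float32)
w_q, scale = absmean_ternary(w)
print(w_q)          # only -1, 0, 1 entries
print(w_q * scale)  # rough reconstruction of the original weights
```

The win is that matrix multiplies against ternary weights reduce to additions and subtractions, which is where the speed and memory claims come from.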
It was; it's just that they've now officially released a 2B model trained for the BitNet architecture.
I've worked on this topic a lot: did it once last year, and this year's version is the update above. Also, I just pushed a major update to the website for a cool thing: https://dcda-v2.vercel.app/ (please check it out again!).

The thing is, I really don't have the motivation to work on this, because it requires a large community effort to gather a meaningful amount of data. And from an ML perspective, is it worth the effort? You'd have to take in the complexity of the Hindi language itself. Suppose I train the model to include the maatras: would it still be able to identify two characters side by side, joined by the horizontal line, with the maatras? If someone convinces me that this kind of dataset would have very high value for the digitization of the language and its ecosystem, and that it would prove extremely useful for future researchers, then sure, I'm down to work on it.

The implementation I'm thinking of is really easy to build, and we wouldn't have to sit for hours writing samples on our own. We could distribute the task to the crowd. My idea for data collection is to get people in person to write a few letters on a piece of paper, then use CV to crop them out from the marked rectangles. I'm dumbing down the explanation, but yeah, it would require CV and markers.

I could even collect data from the web app itself, but not many people would chip in. I'm not exceptionally famous and don't have a huge following where I could get thousands of inputs in a few days/weeks/months. With the network I have, it would maybe take years to get a meaningful variety of data, and I'm talking about just the base characters without maatras.
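To make the CV part concrete, here's roughly what I have in mind, as a minimal OpenCV sketch: threshold the scan, find rectangular contours above a size cutoff, and crop the handwriting inside each box. The function name, file names, and thresholds are placeholders, and a real pipeline would also need deskewing and sorting the boxes into reading order:

```python
import cv2

def crop_marked_boxes(scan_path: str, min_area: int = 5000):
    """Find dark rectangular outlines on a scanned form and crop
    the handwriting inside each one."""
    img = cv2.imread(scan_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Invert-threshold so the printed box outlines become white blobs
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    crops = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue  # skip noise and stray pen strokes
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) == 4:  # keep only roughly rectangular shapes
            x, y, w, h = cv2.boundingRect(approx)
            pad = 5  # trim off the printed border itself
            crops.append(img[y + pad:y + h - pad, x + pad:x + w - pad])
    return crops

samples = crop_marked_boxes("scanned_sheet.jpg")
for i, s in enumerate(samples):
    cv2.imwrite(f"char_{i}.png", s)
```

That's the whole trick: print sheets with pre-drawn boxes, have people fill them in, scan, and let the contour detection do the segmentation for you.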
Sorry for the long rant, but yeah, I'm really not motivated to work on this right now, even though I do have the idea and the plan. I'd love to pass the torch to a newcomer or an ML enthusiast, or someone who's more into it than I am at the moment.
Thanks a lot! It's not only the joint letters: the diacritics are so diverse too, and it's a shame that we don't have any dataset covering this language and its diacritic combinations. Honestly, the possibilities are nearly endless, and I don't know how we could generalize a model for this. It's surely possible, but I'm not that experienced in ML, so I'd really like to hear ideas on this. As for the dataset, I think I'm going to do something about a diacritics-included dataset in the future. I have plans, but not the time to execute them fully, and the response and impact so far have been quite small.
Nice to know. Thanks.
Same, I have an HDD from 2012 that holds my childhood memories. The first thing I'm going to do when I start earning is get it repaired by a reputable service.
Everything was. Is …
Welcome here!
I think the bigger bottleneck is SLAM; running that is intensive, and it won't run directly on video. SLAM is tough, I guess, and reading the repo doesn't give any clues that it can run on CPU inference.
There is a repo they released.
It will; they've released a repo with the code.
I mean, I didn't see any pressing need for a Google Docs alternative, so I might actually be living under a rock.
I am not a bot trust me.
taste of his own medicine
I checked out almost all of them from the list, but 1B models are generally unusable for RAG.
I use Page Assist with Ollama.
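If anyone wants to skip the extension: Page Assist basically talks to your local Ollama server, so you can hit the same REST API directly. A minimal sketch, assuming Ollama is running on its default port and you've already pulled a model (the model name here is just an example):

```python
import requests

# Send a single, non-streamed generation request to the local Ollama server
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",   # any model you've pulled with `ollama pull`
        "prompt": "Summarize what Page Assist does in one sentence.",
        "stream": False,       # one JSON response instead of a token stream
    },
    timeout=120,
)
print(resp.json()["response"])
```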
yay!