IMO, this announcement is far less significant than people make it out to be. The feature has been available as a private beta for a good few months, and as a public beta (with a waitlist) for the last few weeks. Most of the blind people I know (including myself) already have access and are pretty familiar with it by now.
I don't think this will replace human volunteers for now, but it's definitely a tool that can augment them. I've used the volunteer side of Be My Eyes quite a few times, but I only resort to that solution when I have no other option. Bothering a random human multiple times a day with my problems really doesn't feel like something I want to do. There are situations when you either don't need 100% certainty or know roughly what to expect and can detect hallucinations yourself. For example, when you have a few boxes that look exactly the same and you know exactly what they contain but not which box is which, Be My AI is a good solution. If it answers your question, that's great; if it hallucinates, you know that your box can only be one of a few things, so you'll probably catch that. Another interesting use case is random pictures shared to a group or Slack channel: it's good enough to let you distinguish between funny memes and screenshots of important announcements that merit further human attention, and perhaps a request for alt text.
This isn't a perfect tool for sure, but it's definitely pretty helpful if you know how to use it right. All these anti-AI sentiments are really unwarranted in this case IMO. I've written more here: https://dragonscave.space/@miki/111018682169530098
>Bothering a random human multiple times a day with my problems really doesn't feel like something I want to do.
Please don't feel like you're bothering us. I've had this app for years and absolutely cherished the few calls I've gotten. I get really bummed if I miss a call.
Have you ever needed to do something like make an appointment but kept putting it off because you just really didn't want to talk on the phone?
That can happen to anyone. Some blind people are introverts and don't want to talk to random strangers all the time.
Also, while the vast majority of volunteers have the best intentions and try hard to be helpful, you never know what you're going to get. Some are way too chatty, some offer unsolicited advice.
Calling a human being is much higher-friction than just opening an app; if that weren't the case, we'd all still be calling restaurants instead of ordering on Uber Eats.
I also would prefer not to call volunteers at night. This isn't much of a problem if you live in the US, as the app is segregated by language, not country, so you'll probably find somebody down under. I have an additional complication of having to deal with foreign language content because of where I live, so English often isn't good enough for me.
OpenAI announced from day 1 that GPT-4 is multimodal, so it was mostly waiting for safety censorship and enough GPUs to be available for mass rollout.
This won't entirely replace human volunteers, but these models get rapidly better over time. What you are seeing today is a mere toy compared to the multimodal models you'll get in the future.
Currently there's no model trained on video, due to the sheer size of video data, but in the future there will be video-capable models, which means they can understand and interpret motion and physics. Put that in smart glasses, and it can act as live eyes to navigate a busy street. Granted, it will take years to bring costs down enough to make that viable.
We had enough drama with BeMyAI refusing to recognize faces (including faces of famous people) as it were. If sighted people have the right to access porn and sexting, why shouldn't we? Who should dictate what content is "appropriate", and what about cultures with different opinions on the subject?
> Who should dictate what content is "appropriate", and what about cultures with different opinions on the subject?
OpenAI should dictate that, because GPT-4 belongs to them. So they decide what kind of service they're interested in offering.
There will be plenty of other powerful LLMs that can be used in the near future. Some will be more restrictive, some will be less. If you want fewer restrictions, you will be able to pick one that offers that for you.
> There will be plenty of other powerful LLMs that can be used in the near future.
Extremely optimistic take. What tends to happen is you get centralisation, and regulatory capture ensures the largest players dictate what is acceptable for anyone who wishes to do things differently.
I mean in theory you can go set up your own social network or video sharing site with whatever rules you like, but you should assume government regulators and big tech will attack you if you do so and believe in the principles of free speech, or simply wish to create a safe-space for conspiracy theorists.
>> in the future there will be video-capable models, which means they can understand and interpret motion and physics.
Videos may not suffice. Videos are 2d, with 3d aspects being inferred from that 2d data, which is an issue for autonomous driving based on cameras. A proper model for AI training would be 3d scans rather than videos. The best data set would be a combination of video and 3d scanning. Self-driving cars which might combine video with radar/laser scanning may one day provide such a data set.
There is talk of a 3d version of Google streetview, one using a pair of cameras to allow true VR viewing. That might also be good training data, as it will capture, in 3d, many street scenes as they unfold.
I've actually been fairly impressed with how far monocular depth estimation has come in a few years. I think it should not be relied upon in self-driving cars because they need closer to 100% accuracy and almost no latency, and the SoTA models are currently too slow to run every frame. Not only that, but it's a life-or-death situation where cutting accuracy to save BoM costs seems ridiculous to me.
But in a higher-acceptable-latency, lower-risk environment like this, I am actually quite bullish on camera alone methods. Video understanding has come a long way.
Just wanted to say (not telling you what you feel you should do) that I and others I know don't feel at all like you are 'bothering a random human multiple times a day'. I personally feel extremely lucky when I get a 'call' and am able to help.
Not much else regarding your comment, just wanted to let you know that I have had very satisfying 'calls', I am always looking forward to the next one, and I'm always happy when I get one.
Almost all blind people wear dark glasses. So, instead of the phone doing the camera's work, what about a tiny Google-Glass-esque snap-on to the blind person's glasses that feeds to the phone for processing?
A few companies are trying it. Last weekend, I tried one, and it was not yet polished, but good enough to start. It recognized me (my friend already has me in his phone contacts).
I'm trying to help a friend with his non-profit initiative, bringing the cost to about a third or even a fourth of that of Google Glass[1]. After reading the article and the other First Impressions with GPT-4V(ision)[2], it is apparent that this can be done much more simply, and soon enough. The rumor about Ive[3] and Sam[4] talking about AI in hardware has already given me some good hope.
If anyone else does AI-enabled hardware assistance for blind people and has devices that will be less than $500 a piece in retail, I’d love to talk and introduce them to the right people.
I'm the Founder & CTO of Envision. We're building EXACTLY the product you're describing. Envision Glasses is a bunch of computer vision tools built on top of the Google Glass Enterprise Edition 2. We have more than 2,000 visually impaired people across the world using the Glasses in more than 30 different languages.
P.S.: I know the EE2 has been discontinued, but we've been working closely with Google to ensure current and future demand is met. We're also experimenting a lot with other exciting off-the-shelf glasses that I can't talk about here, but I'm super excited for this whole glasses + AI space!
It was one of the best apps out there for (instant) text recognition, and I was pretty happy to pay for it, but since it went free, it's really not the same. There's nothing else like the old Envision out there.
Also, not an issue for me personally, but dropping support for Cyrillic in the middle of a brutal war in Ukraine, in an automatic update released with no prior warning, was an asshole move if I ever saw one.
The main problem with the glasses, in my view, is the cost. It means those of us in developing countries cannot afford it, meaning it's only available to a select few. This seems to be common for a lot of assistive technology solutions, even software.
I think without a major consumer product as a platform, it will never reach the number of people it wants to help. As an example, take what the iPhone did to the market for braille note takers.
I have been beta-testing BeMyAI for roughly a month. It is a milestone in independence. It typically describes scenes in a very useful way, allows me to ask detail questions, can compare differences in pictures, has very good OCR and can also translate foreign languages. Just a few examples:
* It can help reading a menu. But not just straight linear reading... I told it I prefer veggie today, but would also eat something light with meat. So it highlighted the veggie options for me, and in one case even tried to guess the type of dish from a photo which lacked a text description.
* I had it search for cobwebs on a ceiling. Vacuumed them away, and took a second picture, asking if they were gone now.
* It told me a houseplant of mine has yellow leaves, and told me which ones by describing a path along the stems.
In general, the scene description is very good, and the built-in (implicit) OCR is just a game changer. It is like having a real human reading something to you. Typically, you don't want the human to read all the text on the page; you are only interested in some detail, and usually instruct the helper about that so they don't have to read everything to you. The same now works with an AI, with consistent quality.
Volunteers are great, don't get me wrong, but they come with a lot of problems. You sometimes have to request help several times, because frankly, sometimes you simply get people who can not help you due to their own abilities. Also, the video feed is demanding on connectivity. If I try to read a menu on the train with volunteers, most of the time this will fail due to "blurry vision" and the camera dropping out. Sending a single picture every few minutes is totally OK in these situations.
> Be My AI is perfect for all those circumstances when you want a quick solution or you don’t feel like talking to another person to get visual assistance.
It presumably will be getting used in low risk situations.
Well, I'd say it is about as dangerous as trusting a random stranger is. Or do you seriously believe blind people have never been trolled by fellow humans?
Please don't project the typical AI-angst onto this assistive technology. It doesn't deserve it, especially from people without hands-on experience.
That is exactly the situation where I would NOT trust AI and I feel like the app would refuse to do it too.
I don't know about other countries, but here in Switzerland I've had the app for five years and got only 4 requests for help during that time, which made me think that there were way more helpers than help-seekers. But I suppose they wouldn't be adding AI if that was the case.
I've noticed that if you haven't gotten a call in a long time, opening the app seems to get you calls over the next few days. Not sure if it's actual behavior or just my perception.
Why shouldn't they push responsibility to the user? Blindness isn't a mental handicap, their users are rational adults who are far more familiar with the risks of being unable to see than you are. Why shouldn't they have access to one more tool that they can make a judgement call on when to use? And what's wrong with the provider of the tool giving them guidance on when might be a good time to ask a human instead?
I suspect most of the anger in this thread is just the usual anti-AI stance, but it honestly feels extremely patronizing in this context.
Indeed, but there is a very different risk profile in this sort of situation. A human might be able to say: "I can't see that label very clearly, but I think it says X...", and the user might make their risk assessment based on that implied (lack of) confidence. This tends to be a weakness in AI solutions, where there isn't often good feedback on the model's own estimate of its error rate.
On the plus side, I suppose in this application with the humans there's always a risk of a malicious volunteer outright lying, which presumably isn't a risk with the AI solution.
Yeah, but the alternative right now is completely random, unverified, human volunteers with no qualifications and no training. While the vast majority have the best of intentions, some people are just...not that good at helping.
I've been playing with Be My AI, it's excellent. It's got a really good prompt that enables it to be really descriptive and helpful with very little hallucination. It's actually a lot better at describing most photos than the average person is.
Making the world more accessible to blind people is a big challenge. There are no perfect solutions, but the solutions we do have today are enabling millions of blind people to have full careers and live independently.
This is a great improvement. It's not perfect, but neither were the previous technologies that blind people have been using. Don't let the perfect be the enemy of the good.
I don't agree with this comparison. I'm not OK with being forced into the center of the distribution when being compared to the general population. Just because an AI may be better than most humans at most things, this does not extrapolate to it being better than me at some specific task. Additionally, I believe there is a world of difference, ethically speaking, between me causing myself harm due to some mistake I made and a machine causing me harm because of a design flaw.
That's beside the point. The point is that this is an "AI"-based product, and tech is generally supposed to be an improvement on the status quo; the fact that "X might also have issues or blind spots" isn't really relevant.
The argument that "humans are flawed too" is not useful. We know humans are flawed; that's why we want tech to fill those gaps. Not flawed, potentially dangerous tech, but actual tech, like a plane that flies you to your destination 99.9% of the time.
This is an improvement over the status quo—there are several blind people who have commented on this thread saying that they are reluctant to use the human service unless they absolutely have to. Giving them an AI tool that they can use in low-stakes situations is a huge improvement over not having a tool at all, and the human service is still there for the high-stakes situations they were already willing to use it for.
Also, there are many low-stakes situations where AI is actually quite a bit better than the average human:
1. Take a photo of a piece of tech: AI can not only describe what's going on but tell you exactly how to fix it (e.g. "it's showing a fatal error. hold down the power button for 10 seconds to reboot it")
2. Take a photo of clothing: AI uses consistent, neutral language to describe garments, while humans would all describe the same item differently based on where they live, their own personal style, etc.
3. Take a photo in a major city or near a famous landmark, and AI will recognize it and tell you about your location, whereas a random human has probably never been there before
4. Take a photo of an insurance bill, AI can explain what some of the technical terms mean if you don't understand them
5. Take a photo of a menu, ask it to summarize all of the vegetarian entrees and it will list them in seconds ("one veggie burger, one stir fry with tofu, three sandwiches and all of the salads"); a human might take a minute or two to carefully read the menu and give a much longer answer
To be fair, humans make mistakes at an incredible rate. Even the ones who make life or death decisions.
I'm talking about the constant stating of the obvious, if these models weren't useful we wouldn't deploy them. It's like the whole internet constantly making the statement "cars are better at covering distances in shorter time frames than humans"...we know, that's why we build them and pay a bunch of money for them.
I used to be a personal support worker for adults with cognitive and physical disabilities.
In some sense, I was a non-AI solution that enabled them to do their activities of daily living.
I'm pretty sure they'd have preferred not to be dependent on me, though. I imagine that's true here, too--no more need to edit your behaviour/filter your thinking/navigate interpersonal stuff just to do something minor.
As a blind person, I hope this never happens. But if it does, hopefully the responsibility falls on the person who decided to use the AI improperly. I know this might sound crazy, but maybe, just maybe, we should allow blind people themselves to take responsibility for the ways in which they use technology. Rather than locking up the dangerous AI that they might hurt themselves with, perhaps we might just allow blind people themselves to determine how much risk they are comfortable with taking.
> Be My AI is perfect for all those circumstances when you want a quick solution or you don’t feel like talking to another person to get visual assistance.
I think, as with everything in life, it's situational. Would I trust imperfect AI to pick medicine out of 1000 available in a drugstore? Probably not - what if it picks Fentanyl (or whatever drug could kill me). Would I trust it to pick Ibuprofen to treat a headache from my home medicine cabinet? Absolutely. There is nothing there that can kill me. Would I trust it to tell me the dose in mg? Current systems are already better at OCR than the average human.
This summer I used Google Translate to pick medicine in Italy and it was pretty good at translating labels - definitely better than the pharmacist, who did not speak English at all.
By the way, lots of people die in the US because the wrong medicine was dispensed - and that has nothing to do with AI. People are imperfect, and many drug names are long, incomprehensible and easy to confuse with each other.
Lots of blind people have their homes set up with everything carefully arranged and no hazards. They prepare food, clean their house, commute to school, to jobs, and to appointments via familiar routes with no dangerous intersections, or they might use buses, trains, and Uber / Lyft just like millions of other people who aren't blind.
Being blind can be full of annoyances and frustrations, but I think it's a stretch to say that it's "dangerous".
Here we have a tool that can materially improve blind people’s lives and instead of making it accessible you want to deny them access until the product reaches a quality bar that is impossible to reach.
That's wrong and immoral. The world is full of risk. As individuals we choose to accept some risks and move on with our lives. I think it's appropriate that blind people are given the same opportunity.
I've been a Be My Eyes volunteer now for a couple of years. I've helped about 5 people and it's been very fulfilling. I jump with delight when I get a call because they are rare (thanks to a lot of volunteers) and the people I help are always so kind.
Two things I've helped with:
Wrapping a gift by helping them orientate the gift correctly (I don't even do this!)
As a user, my biggest concern is that it won't describe certain content, such as adult content or something that might be considered offensive. I feel like I should be able to have full control over the output.
On the other hand, I'd like to be able to test its ability to describe graphics to me. If it's able to turn graphics into an accessible table I can browse with my screen reader, that would be revolutionary.
Reminds me of when I was a young teenage shithead with shithead friends who discovered the text-to-talk support provided for deaf landline telephone use. You would go to a website and enter a number you would like to call. Then a human operator would read out what you typed to the person you called.
As we quickly learned, the operator would say anything. Anything. So for a few days we would call each other with the operator and never were able to find the limit. I have great respect now, those operators were inhumanly stone faced, and respect that the system was perfectly transparent. Nothing typed was hidden or obfuscated.
There was a programme the other day about this on BBC Radio 4. Unless they’ve resolved the problem already, it will refuse to describe any scene with faces in it, for privacy reasons. Which obviously rules out a lot of useful use-cases. You’re in a cafe wondering what’s going on, Be My AI can’t help if there’s a face. I think it was related to some EU legislation which Be My Eyes are now at least trying to change for this use-case. I wish them the best of luck.
Update: Just tried this again, and it looks like they've loosened the restrictions. It will now describe people again. Yay! It just won't try to identify who someone is.
Original comment: Yeah, this wasn't always the case, until recently. People were using it to describe their kids, spouses, etc. Pissed a lot of folks off when they disabled it. I never even thought of using it for that, so now that I realize I could have, it kinda' pisses me off as well. Honestly, I never cared too much what they looked like, but it would have been interesting to hear the ChatGPT viewpoint. :)
But definitely not something I could have asked a fellow human about without it seeming really weird, and being confident I wouldn't get a biased answer. Though it's probably unrealistic to expect that the AI wouldn't also give a biased answer.
I don't see the point of this. This app has a massively disproportionate number of helpers versus those needing assistance. Personally, I've had the app installed for 3 years and never once was asked to assist anyone. Why take the risk of AI providing false info and injuring someone when there are willing and able humans at the ready?
Are there a large number of users that feel like they are wasting volunteer time with menial tasks?
Strictly speaking, just because the app has excess helpers doesn't necessarily mean the visually impaired users wouldn't like assistance more often, just that they wouldn't bother others about it.
There are probably plenty of times you might want to ask a question specifically of something as impersonal as a machine, e.g.: how does this itchy spot I've had look?
I've noticed that people like to ring up their private purchases at the pharmacy using the self-checkout. I wonder if purchases of such items have increased since the self-checkout option was opened, as people shift their purchases from other pharmacies to this one.
Considering how often I’ve seen the complaint from your parent post, it’s quite clear people don’t mind. Quite the opposite, they’d embrace the opportunity. Maybe the people who need assistance don’t realise that, but again, that complaint is quite common. I’d like to help but never signed up specifically because of that surplus.
So they had a solution based on humans who are eager to help and are replacing it with an automated system which when mistaken can have disastrous results and cause personal injury. Seems odd to me. A humanised approach is often seen as a positive and this cuts it out without necessity.
All that said, I don't have any insider information. Perhaps the people who need assistance do prefer talking to a machine.
I would personally feel pretty guilty about making use of the tool knowing I'm using someone else's time. Also privacy reasons, I guess; it feels like there's a difference between a person seeing your photo and an AI seeing it.
There's a blind person upthread who concurs that they avoid using it unless absolutely necessary [0]. I sympathize: Even if someone assures me that they're happy to do something for me, it's hard to believe they're not acting out of a sense of duty. This is probably especially true for someone who has a specific disability that they know breeds sympathy.
It sounds like, as part of the sign-up, they should ask an experienced helper to make an "artificial" request of the new signee, to give them a chance to see the workflow and to provide an opportunity to ask questions of someone who's been in the role they've signed up for. If there's really an excess of helpers, generating one extra request per new future helper seems reasonable.
I’d guess that they have a rating system and prefer assigning request to helpers who reliably perform well… which might be why I didn’t get repeat requests after my initial fumbling.
Sure, and that's an important tool for improving quality in the face of different skill levels / levels of effort. But that's independent of investing a bit in improving the skill level across the helper base; as you experienced, it's something where experience helps, and if your first time in the real flow is with an actual request, there's no way to avoid the helper needing to deal with both helping and learning simultaneously.
Thanks for being a volunteer. But please don't judge blind users if you apparently cannot put yourself in their shoes...
Yes, one reason is that some blind users don't want to waste volunteer time. Another reason is that volunteers vary, but the performance of the image AI is predictable. Another reason is that the AI OCR is fast, can also translate, and, surprise, the text is easier to handle later, for copy&paste.
Besides, the performance of the AI describing pictures is, sorry to say, a little bit above what the typical human is willing or able to do. IOW, some humans perform worse than the AI. Also, camera access is different from picture taking. I use volunteers when I need more interactive help, but I totally prefer the AI when I just need a single pic.
I was a sighted volunteer on a call with a fellow using a treadmill touchscreen. He already knew the menu flow, but the UX was dynamic and it wasn't his own screen, or he could have put locator dots on it (Lesson to all designers! Hardware can have physical buttons!). Our interaction was mostly him stating his goal, me determining the screen's starting state and then where UI elements were, and me feeding back his finger position like "a little left ... no, too far, now up a little ... ok hit it."
I think we can imagine an AI could describe the screen, and even find non-language visual elements if asked explicitly, like arrows or turtle vs hare icons etc. But is it ready to have shared context of how people need to interact with that UI?
I've had it for five years and have had 8 or 9 calls, so you're right about it being very infrequent. You need to be mindful, at least on Android, that because the application is infrequently used, the OS may remove its permissions. That auto-removal can be disabled for Be My Eyes specifically.
As for menial tasks, I could definitely see people wanting to use this instead of calling a stranger for more personal matters -- at least initially.
Thanks for being a volunteer. For me as a blind person, there still is that sense that I'm bothering someone when I ask for human assistance. I realize that this is not really rational, given how much people seem to appreciate the opportunity to help, but it takes some effort to overcome this in my own brain regardless.
I wanted to let you know that I _race_ to answer a call as quickly as possible. If I didn't want to be called I never would have installed the app in the first place.
No, this is genuinely a useful technology. Blind people are getting a ton of value out of it. It's way faster than humans, you can do it quietly without bothering anyone, and honestly it's better at describing some things than 90% of humans are.
There are a lot of things that are just tech demos. This is far more than that.
You apparently have no personal experience with needing help. Can you please take your AI-angst somewhere else, and leave the best innovation that assistive technology has ever made for those to discuss who actually know what they are talking about?
This sort of dismissive comment is very anti-social and full of hidden hatred, projecting your quarrels with a company onto people that really need the help provided.
And before you click, I am that pissed because I am blind. You have absolutely no idea what that means, and what BeMyEyes and BeMyAI did for us. Just go home and hate someone else please.
I'm sorry, but after seeing numerous self driving cars plow into people and no clear testing about what's safe when the platform already provides instant access to a live human, I'm really worried about technologies like this.
That death is of course tragic, but it's important to put it in context.
That's the only death caused by a self-driving car. The only one. There was a safety driver with their hands on the wheel, and they didn't see the pedestrian either. And that was an Uber self-driving car, and they've since canceled that project.
While incredibly tragic, it's not at all fair to say that self-driving cars are constantly plowing into people.
Waymo has driven more than 1 million miles with no human injuries. That's dramatically better than any human driver.
Cruise is a close second in number of miles, also with no human injuries.
Be My Eyes has existed since 2015. If trolling is common, there should be reports of it online. A cursory search hasn’t revealed anything, but maybe you’ll find something.
Trolling isn't a big problem, but many humans have good intentions but just aren't that helpful. Some want to be chatty or give unsolicited advice. Some just aren't very good at tech. Some just aren't very good at describing things clearly.
Is this some sort of sarcastic joke? No, I absolutely don't think that people who are unable to see could know whether the thing they cannot see could accurately be described by an AI.
Can you give an example of the dangerous situation you are imagining here? I do not see how this app could be dangerous unless combined with poor judgment and failure to follow instructions (e.g., do not use to read medicine labels, do not use to cross the road).
I've answered maybe 10 Be My Eyes calls over the last couple of years. I can see some value in AI describing labels or food items; however, most of the calls I've answered are more nuanced. Unusually, I've answered two calls this week. The first one was looking at photos of a hotel to help decide if it met the requirements of the person. The second call was helping someone perform a blood sugar test; I had to tell the person when I thought the drop of blood on the tip of their finger was large enough to test, and read the result off the tester. Neither of these are candidates for AI, but let the users be the ultimate judge. I am continually impressed by the ingenuity and resourcefulness of the people I have interacted with.
This isn't trying to replace human volunteers, but complement them.
I know blind people who are making just as many human Be My Eyes calls as they were before, but they're using Be My AI even more, for things where they wouldn't have even bothered to use the service at all before - because the AI is so fast and convenient.
I am surprised by your assessment that these are not tasks for the AI. Well, the first one is troublesome, but judging the shape of a drop of liquid according to a well-established procedure sounds quite on par.
Maybe, but it was on the tip of a finger in a moving image, changing size as the person squeezed their finger. Also, something that I never previously considered: blind people tend to have little or no artificial light on when alone. Luckily, the app allows the person providing the assistance to turn on the flash on the other person's phone.
This is great. Many years back a group of us developed an OCR app on Android. At the time the market was not saturated, and we found a way to deliver a service that was close to state-of-the-art, if a bit expensive at the time. We spent a lot of effort on pre-processing and user education through the app to try and maximize the chance of good OCR. It was pretty good. We managed to have a few thousand users and made OK $, but nothing near life changing. Then the market got competitive and saturated, and our focus also started to slip to other things.
What we found is that a small cohort of our users were in some way visually impaired and were very reliant on the app. We ended up consciously deciding to focus exclusively on this cohort, which is not a typical product management strategy (assuming you're trying to max $), but the remaining enhancements we did made some lives better and we were happy with that.
I love being a Be My Eyes "eyes". I'll stop anything to answer that call and help a blind person.
Adding AI for basic things would probably be a game-changer for blind people who have lots of questions, or embarrassing questions, or want to read their credit card number or something.
> If you had to choose a pathway to fix the blindness issue, which one would you choose?
But we don't need to choose. We are not living in some Age of Empires like game where the town centre can only develop one item from the tech tree at a time.
The set of people who can develop this are entirely different from the set of people who can work on a bioengineered cornea. The pot of money this is financed from is not the same as the one we would be financing those other projects.
Finally there are many separate biological problems which can cause blindness. A bioengineered cornea might help with some of them, but not others. Even if we would have a truly amazing and cheap and reliable bioengineered cornea we would still have blind people whom it would be unable to help. Because of this it is worthwhile to work on both.
Pros:
1. AI is way more proven of an investment. There's an extremely clear path forward where money = better vision capabilities.
2. AI is extremely cost efficient and cheap to mass deploy. Blind users can get access with a cheap monthly subscription of $20, easy to afford for the vast majority of the developed world. Compare this with a cyber-cornea, which even if it worked, even if it didn't cost $10k to make each one, would still have massive installation costs and benefit only a tiny number of relatively wealthy disabled people.
3. AI works for all causes of blindness. A cornea can only fix problems inside the eyeball itself.
The OpenAI Whisper API has a 50 requests per minute rate limit. It means that I can't have more than 50 concurrent users. If you are building a consumer app, this seems like a no-go. I believe GPT-4 has similar rate limits. How do you work around that rate limit to serve tens of thousands of users?
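A common stopgap under a fixed requests-per-minute cap is to put a shared queue and rate limiter in front of the API instead of calling it per user, so bursts from many users get smoothed out (and beyond that, you request a higher limit or shard across accounts). A rough sketch in Python; the `transcribe_chunk` wrapper and the 50/minute figure come from the comment above, not from verified API docs:

```python
import asyncio
import time
from collections import deque


class RateLimiter:
    """Allow at most `max_calls` API calls per `period` seconds, shared by all users."""

    def __init__(self, max_calls: int = 50, period: float = 60.0):
        self.max_calls = max_calls
        self.period = period
        self.calls: deque[float] = deque()  # timestamps of recent calls
        self.lock = asyncio.Lock()

    async def acquire(self) -> None:
        while True:
            async with self.lock:
                now = time.monotonic()
                # Drop timestamps that have aged out of the window.
                while self.calls and now - self.calls[0] > self.period:
                    self.calls.popleft()
                if len(self.calls) < self.max_calls:
                    self.calls.append(now)
                    return
                # Wait until the oldest call leaves the window, then retry.
                wait = self.calls[0] + self.period - now
            await asyncio.sleep(wait)


limiter = RateLimiter(max_calls=50, period=60.0)


async def transcribe_chunk(audio_path: str) -> str:
    """Hypothetical wrapper around the actual Whisper API call."""
    await limiter.acquire()
    # ... call the real API here ...
    return f"transcript of {audio_path}"


async def main() -> None:
    # Many user requests can queue up; only 50/min actually hit the API,
    # so this demo deliberately paces itself.
    tasks = [transcribe_chunk(f"user_{i}.wav") for i in range(120)]
    results = await asyncio.gather(*tasks)
    print(len(results), "chunks transcribed")


if __name__ == "__main__":
    asyncio.run(main())
```

Note that the cap limits API calls per minute, not concurrent users; whether that's workable depends on how often each user actually triggers a request.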
Are there already solutions that allow a blind person to walk around in a city and the smartphone tells them to go left/right/forward/stop, or is anyone working on them?
I'm also curious about purely AI based versus LiDAR solutions.
This might be more convenient than braille in a few circumstances, like replacing a braille menu in a restaurant - but it definitely doesn't make braille obsolete.
Also, note that many blind users access Be My Eyes using a refreshable braille display.
They have! There's a Stable-Diffusion-enhanced version of Zork[1] out there, and there are also extensions[2] for Oobabooga which facilitate dynamic adventures. Also, this point-and-click adventure game[3] is pretty fun if you can get it to run.
Doesn't work, SD's comprehension is too weak to make complex scenes.
Now DALL-E 3 is good enough; the problem is whether it'll be cheap enough. I'm 100% sure there are revolutionary new forms of adventure games being developed based on it.
Tangentially related: a friend made a Discord bot that automatically generates D&D items, NPCs, monsters, etc. with images and full descriptions. It uses neural.love and Poe for images and text generation, respectively.
>Finally, online text-based role-playing games brought to life!
Not yet. OpenAI is too preachy: if your adventure brings you into a bar and you try to drink something, you will get a few paragraphs about how "alcohol is bad". It is also very child-limited/targeted, so if a monster spawns, the hero will most likely befriend the monster with love and they will live happily ever after.
The open, uncensored ones were still a WIP last time I checked; they either forget the conversation from a few moments ago, or the training made them dumb.
If I am wrong, then someone tell me where I can try a demo of such a text adventure that has memory and is not crippled as if it targets "small American children".
The intent is more about avoiding legal and political risk. Even a grocery chain's LLM recipe generator gets bad press. I guess they need to warn you that drinking bleach is bad.
I normally test LLMs by providing a list of bizarre "weapons" and asking how I could use them to defeat an improbable beast.
It turns out that an enormous dust mite the size of a car needs a disclaimer when your weapons are a comb, a plastic horseshoe, an etch-a-sketch, a baseball glove, and a worn copy of the farmer's almanac. That being said, ChatGPT still tends to win the creativity test.
You could use relatively low-res images to generate "accurate" ASCII for a place.
It would be relatively lightweight but still "realistic".
I'm not well versed in image-to-ASCII conversion, but I guess you could use AI to segment and recognize objects too complex to recognize easily and adjust the "image" generation accordingly.
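For what it's worth, the basic image-to-ASCII step doesn't need AI at all; the classic approach just maps pixel brightness onto a ramp of characters. A minimal sketch using Pillow (the file name and character ramp are arbitrary choices):

```python
from PIL import Image  # pip install Pillow

# Characters ordered roughly from dark to light.
RAMP = "@%#*+=-:. "


def image_to_ascii(path: str, width: int = 80) -> str:
    """Convert an image to ASCII art by mapping pixel brightness to characters."""
    img = Image.open(path).convert("L")  # grayscale
    # Terminal cells are taller than they are wide, so squash the height.
    aspect = img.height / img.width
    height = max(1, int(width * aspect * 0.5))
    img = img.resize((width, height))

    pixels = list(img.getdata())
    rows = []
    for y in range(height):
        row = pixels[y * width:(y + 1) * width]
        rows.append("".join(RAMP[p * (len(RAMP) - 1) // 255] for p in row))
    return "\n".join(rows)


if __name__ == "__main__":
    print(image_to_ascii("scene.png", width=100))
```

The AI part the comment imagines would sit in front of this, e.g. segmenting the scene and rendering each recognized object with a more deliberate glyph choice.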
AI is plainly revolutionary for the blind. You don't need 'high level prosthetics', you just need a smartphone with a camera, and you can now see the world.
AI is also revolutionary for the deaf. Neural speech-to-text is getting so good that soon 'hearing aids' will be secondary; you can just read from a screen with live subtitles.
You are pretty much making stuff up with "Will it become the accessible norm? Nope." ChatGPT Plus only costs $20 a month. How much more 'accessible' can it possibly get? Has there ever ever ever been a technology for the deaf and blind as revolutionary AND cheap as this?
If you have some anti-corporate agenda, fine, but self-righteousness gives you no right to lie.
And all it's going to cost the rest of us is our ability to free and critically think.
No biggie.
Accessibility is accountability. How accountable is 'OpenAI' to anyone other than its massive investor Microsoft? You want me to be self-righteous so you can be dismissive of the actual path we are travelling on, likely because you benefit from it, and most likely not because you're blind.
"Anti-corporate agenda" is a pro-human perspective. Difficult to call people liars without being utterly delusional, when there is a clear track record of corporate abuse of purpose.
"Don't be evil!" That turned out well, huh?
Life loves irony; and the truth is, we will grant the blind vision only for them to see it all collapse. Congrats on being a real champion of the disabled.
> 'hearing aids' will be secondary, you can just read from a screen with live subtitles
I already do this on Discord: my bot listens to Discord calls, saves the audio in chunks, sends them off to Whisper, and I get a neat, tidy, live-ish transcription. It's a game-changer honestly, because with the combination of APD and ADHD, being able to go back in the conversation and see what I missed or what came out garbled is amazing. Whisper is so good compared to the previous state of the art it's nuts. It understands tone, slang. My next project is to see if I can get it to detect different speakers.
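For anyone curious, the chunk-and-transcribe loop can be surprisingly small with the open-source whisper package. A rough sketch (the directory of saved chunks and the polling interval are assumptions, and a real bot would post results back to a Discord channel rather than print them):

```python
import glob
import time

import whisper  # pip install -U openai-whisper

# "base" is fast enough for near-live use; larger models are more accurate.
model = whisper.load_model("base")


def transcribe_new_chunks(chunk_dir: str, seen: set[str]) -> None:
    """Transcribe any audio chunks that have appeared since the last pass."""
    for path in sorted(glob.glob(f"{chunk_dir}/*.wav")):
        if path in seen:
            continue
        result = model.transcribe(path, fp16=False)
        seen.add(path)
        # In a real bot this would be sent to a Discord text channel.
        print(f"[{path}] {result['text'].strip()}")


if __name__ == "__main__":
    seen: set[str] = set()
    while True:
        transcribe_new_chunks("voice_chunks", seen)
        time.sleep(5)  # poll for new chunks every few seconds
```

Speaker detection is a separate problem (diarization); Whisper itself doesn't label speakers, so that part would need an additional tool such as pyannote.audio.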
In general I think you're right, but maybe Be My Eyes is not the best example? The company has been around since 2015 helping people since day one, doesn't seem to have shifted focus in any of these 8 years, and has a huge number of users relying on it daily for help.
>>How are photos and chat interactions processed and stored, and who will have access to my data?
Be My Eyes initially processes your photos and chats to (a) send them to Open AI for further processing, (b) provide Be My AI’s descriptions, and (c) give you follow up options. Open AI processes the photos and chats to provide the AI response.
During our beta testing, Be My Eyes stores all photos and chat interactions on encrypted servers so that we can review them, if necessary, based on tester feedback. That helps us build a better service based on your feedback. It also allows us to address any safety or content issues that may come up. Open AI does not store the photos or chats after processing them. None of your photos or chats are being used to train the AI.
The storage and use of your photos and chats by Be My Eyes are governed by our privacy policy.
That does not mean the data isn't being mined, or processed into other models. The data may not be directly training the AI, but that doesn't mean the correlations aren't. Also, the key caveat here is "During beta testing".
>>Governed by our privacy policy:
>>>>The one exception is that if you use Be My AI or another Service powered by third-party artificial intelligence technology, and the images or video you submit contain personal information, that information could be processed by our third-party provider to train and improve the artificial intelligence technology.
>>>>We also work with third-party ad networks and advertising partners to deliver advertising and personalized content on other websites and services, across other devices, and possibly on our Services. These parties may collect information directly from a browser or device when an individual visits our Services through cookies or other data collection technologies.
Agree with other comments. The company has been providing real tangible value to its users via a very dense network of volunteers. I know that because my wife has been using it (as volunteer) and helping people very frequently. I hope this experiment with AI is successful for them.